LKML Archive on lore.kernel.org
* timers: Move clearing of base::timer_running under base::lock
@ 2020-12-06 21:40 Thomas Gleixner
2020-12-07 1:10 ` Frederic Weisbecker
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:40 UTC (permalink / raw)
To: LKML
Cc: Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
Anna-Maria Behnsen, Sebastian Andrzej Siewior
syzbot reported KCSAN data races vs. timer_base::timer_running being set to
NULL without holding base::lock in expire_timers().
This looks innocent and most reads are clearly not problematic but for a
non-RT kernel it's completely irrelevant whether the store happens before
or after taking the lock. For an RT kernel moving the store under the lock
requires an extra unlock/lock pair in the case that there is a waiter for
the timer. But that's not the end of the world and definitely not worth the
trouble of adding boatloads of comments and annotations to the code. Famous
last words...
Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
kernel/time/timer.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1263,8 +1263,10 @@ static inline void timer_base_unlock_exp
 static void timer_sync_wait_running(struct timer_base *base)
 {
         if (atomic_read(&base->timer_waiters)) {
+                raw_spin_unlock_irq(&base->lock);
                 spin_unlock(&base->expiry_lock);
                 spin_lock(&base->expiry_lock);
+                raw_spin_lock_irq(&base->lock);
         }
 }

@@ -1448,14 +1450,14 @@ static void expire_timers(struct timer_b
                 if (timer->flags & TIMER_IRQSAFE) {
                         raw_spin_unlock(&base->lock);
                         call_timer_fn(timer, fn, baseclk);
-                        base->running_timer = NULL;
                         raw_spin_lock(&base->lock);
+                        base->running_timer = NULL;
                 } else {
                         raw_spin_unlock_irq(&base->lock);
                         call_timer_fn(timer, fn, baseclk);
+                        raw_spin_lock_irq(&base->lock);
                         base->running_timer = NULL;
                         timer_sync_wait_running(base);
-                        raw_spin_lock_irq(&base->lock);
                 }
         }
 }
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
@ 2020-12-07 1:10 ` Frederic Weisbecker
2020-12-07 12:25 ` Peter Zijlstra
2020-12-07 13:07 ` Sebastian Andrzej Siewior
2021-07-27 19:00 ` [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock tip-bot2 for Thomas Gleixner
2 siblings, 1 reply; 13+ messages in thread
From: Frederic Weisbecker @ 2020-12-07 1:10 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
Anna-Maria Behnsen, Sebastian Andrzej Siewior
On Sun, Dec 06, 2020 at 10:40:07PM +0100, Thomas Gleixner wrote:
> syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> NULL without holding base::lock in expire_timers().
>
> This looks innocent and most reads are clearly not problematic but for a
> non-RT kernel it's completely irrelevant whether the store happens before
> or after taking the lock. For an RT kernel moving the store under the lock
> requires an extra unlock/lock pair in the case that there is a waiter for
> the timer. But that's not the end of the world and definitely not worth the
> trouble of adding boatloads of comments and annotations to the code. Famous
> last words...
There is another thing I noticed lately wrt. del_timer_sync() VS timer execution:
int data = 0;

void timer_func(struct timer_list *t)
{
        data = 1;
}

CPU 0                                            CPU 1
------------------------------                   --------------------------
base = lock_timer_base(timer, &flags);           raw_spin_unlock(&base->lock);
if (base->running_timer != timer)                call_timer_fn(timer, fn, baseclk);
  ret = detach_if_pending(timer, base, true);    base->running_timer = NULL;
raw_spin_unlock_irqrestore(&base->lock, flags);  raw_spin_lock(&base->lock);

x = data;
Here, if the timer has previously executed on CPU 1 and CPU 0 then sees base->running_timer == NULL,
it will return, assuming the timer has completed. But there is nothing to enforce that x
will be equal to 1. Enforcing that is the behaviour I would expect in this case, since this is a kind
of "wait for completion" function. But perhaps it doesn't apply here, in fact I have no idea...
But if we recognize that as an issue, we would need a mirroring load_acquire()/store_release() on
base->running_timer.
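For illustration only, a minimal sketch of that mirrored pairing, with
smp_store_release()/smp_load_acquire() standing in for the store_release()/
load_acquire() mentioned above (a sketch of the alternative being discussed,
not the fix that was eventually applied, which relies on base->lock ordering
instead):

/* expiry side: publish "callback done" only after the callback's stores */
raw_spin_unlock_irq(&base->lock);
call_timer_fn(timer, fn, baseclk);              /* timer_func(): data = 1 */
smp_store_release(&base->running_timer, NULL);
raw_spin_lock_irq(&base->lock);

/* del_timer_sync() side: order the check against the subsequent reads */
if (smp_load_acquire(&base->running_timer) != timer) {
        /* if NULL is observed here, data = 1 is observable as well */
}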
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-07 1:10 ` Frederic Weisbecker
@ 2020-12-07 12:25 ` Peter Zijlstra
2020-12-07 12:49 ` Frederic Weisbecker
0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2020-12-07 12:25 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Paul E. McKenney,
Anna-Maria Behnsen, Sebastian Andrzej Siewior
On Mon, Dec 07, 2020 at 02:10:13AM +0100, Frederic Weisbecker wrote:
> On Sun, Dec 06, 2020 at 10:40:07PM +0100, Thomas Gleixner wrote:
> > syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> > NULL without holding base::lock in expire_timers().
> >
> > This looks innocent and most reads are clearly not problematic but for a
> > non-RT kernel it's completely irrelevant whether the store happens before
> > or after taking the lock. For an RT kernel moving the store under the lock
> > requires an extra unlock/lock pair in the case that there is a waiter for
> > the timer. But that's not the end of the world and definitely not worth the
> > trouble of adding boatloads of comments and annotations to the code. Famous
> > last words...
>
> There is another thing I noticed lately wrt. del_timer_sync() VS timer execution:
> Here if the timer has previously executed on CPU 1 and then CPU 0 sees base->running_timer == NULL,
> it will return, assuming the timer has completed. But there is nothing to enforce the fact that x
> will be equal to 1. Enforcing that is a behaviour I would expect in this case since this is a kind
> of "wait for completion" function. But perhaps it doesn't apply here, in fact I have no idea...
>
> But if we recognize that as an issue, we would need a mirroring load_acquire()/store_release() on
> base->running_timer.
Yeah, I think you're right. del_timer_sync() explicitly states it waits
for completion of the handler, so it isn't weird to then also expect to
be able to observe the results of the handler.
Thomas' patch fixes this.
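A brief sketch of why it does, simplified from the patched non-IRQSAFE path;
the essential point is that the clearing store now lives inside the same
base->lock critical section that del_timer_sync() takes to inspect
base->running_timer:

/* CPU 1 (timer expiry), with the patch applied */
call_timer_fn(timer, fn, baseclk);              /* timer_func(): data = 1 */
raw_spin_lock_irq(&base->lock);
base->running_timer = NULL;                     /* store now under base->lock */
...
raw_spin_unlock_irq(&base->lock);               /* release publishes everything above */

/* CPU 0 (del_timer_sync()) */
base = lock_timer_base(timer, &flags);          /* acquires base->lock */
if (base->running_timer != timer)               /* NULL implies CPU 1's section completed */
        ret = detach_if_pending(timer, base, true);
raw_spin_unlock_irqrestore(&base->lock, flags);

x = data;                                       /* guaranteed to observe data == 1 */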
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-07 12:25 ` Peter Zijlstra
@ 2020-12-07 12:49 ` Frederic Weisbecker
0 siblings, 0 replies; 13+ messages in thread
From: Frederic Weisbecker @ 2020-12-07 12:49 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Paul E. McKenney,
Anna-Maria Behnsen, Sebastian Andrzej Siewior
On Mon, Dec 07, 2020 at 01:25:13PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 07, 2020 at 02:10:13AM +0100, Frederic Weisbecker wrote:
> > On Sun, Dec 06, 2020 at 10:40:07PM +0100, Thomas Gleixner wrote:
> > > syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> > > NULL without holding base::lock in expire_timers().
> > >
> > > This looks innocent and most reads are clearly not problematic but for a
> > > non-RT kernel it's completely irrelevant whether the store happens before
> > > or after taking the lock. For an RT kernel moving the store under the lock
> > > requires an extra unlock/lock pair in the case that there is a waiter for
> > > the timer. But that's not the end of the world and definitely not worth the
> > > trouble of adding boatloads of comments and annotations to the code. Famous
> > > last words...
> >
> > There is another thing I noticed lately wrt. del_timer_sync() VS timer execution:
>
> > Here if the timer has previously executed on CPU 1 and then CPU 0 sees base->running_timer == NULL,
> > it will return, assuming the timer has completed. But there is nothing to enforce the fact that x
> > will be equal to 1. Enforcing that is a behaviour I would expect in this case since this is a kind
> > of "wait for completion" function. But perhaps it doesn't apply here, in fact I have no idea...
> >
> > But if we recognize that as an issue, we would need a mirroring load_acquire()/store_release() on
> > base->running_timer.
>
> Yeah, I think you're right. del_timer_sync() explicitly states it waits
> for completion of the handler, so it isn't weird to then also expect to
> be able to observe the results of the handler.
>
> Thomas' patch fixes this.
Indeed!
Thanks.
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
2020-12-07 1:10 ` Frederic Weisbecker
@ 2020-12-07 13:07 ` Sebastian Andrzej Siewior
2020-12-07 14:29 ` Thomas Gleixner
2021-07-27 19:00 ` [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock tip-bot2 for Thomas Gleixner
2 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-07 13:07 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
Anna-Maria Behnsen
On 2020-12-06 22:40:07 [+0100], Thomas Gleixner wrote:
> syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> NULL without holding base::lock in expire_timers().
>
> This looks innocent and most reads are clearly not problematic but for a
> non-RT kernel it's completely irrelevant whether the store happens before
> or after taking the lock. For an RT kernel moving the store under the lock
> requires an extra unlock/lock pair in the case that there is a waiter for
> the timer. But that's not the end of the world and definitely not worth the
> trouble of adding boatloads of comments and annotations to the code. Famous
> last words...
>
> Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
> Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
One thing I noticed while testing it is that the "corner" case in
timer_sync_wait_running() is quite reliably hit by rcu_preempt
rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.
Sebastian
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-07 13:07 ` Sebastian Andrzej Siewior
@ 2020-12-07 14:29 ` Thomas Gleixner
2020-12-07 15:25 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2020-12-07 14:29 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
Anna-Maria Behnsen
On Mon, Dec 07 2020 at 14:07, Sebastian Andrzej Siewior wrote:
> On 2020-12-06 22:40:07 [+0100], Thomas Gleixner wrote:
>> syzbot reported KCSAN data races vs. timer_base::timer_running being set to
>> NULL without holding base::lock in expire_timers().
>>
>> This looks innocent and most reads are clearly not problematic but for a
>> non-RT kernel it's completely irrelevant whether the store happens before
>> or after taking the lock. For an RT kernel moving the store under the lock
>> requires an extra unlock/lock pair in the case that there is a waiter for
>> the timer. But that's not the end of the world and definitely not worth the
>> trouble of adding boatloads of comments and annotations to the code. Famous
>> last words...
>>
>> Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
>> Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> One thing I noticed while testing it is that the "corner" case in
> timer_sync_wait_running() is quite reliably hit by rcu_preempt
> rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.
I assume it's something like this:
timeout -> wakeup
->preemption
del_timer_sync()
.....
Thanks,
tglx
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-07 14:29 ` Thomas Gleixner
@ 2020-12-07 15:25 ` Sebastian Andrzej Siewior
2020-12-07 16:06 ` Paul E. McKenney
0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-07 15:25 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
Anna-Maria Behnsen
On 2020-12-07 15:29:50 [+0100], Thomas Gleixner wrote:
> On Mon, Dec 07 2020 at 14:07, Sebastian Andrzej Siewior wrote:
> > One thing I noticed while testing it is that the "corner" case in
> > timer_sync_wait_running() is quite reliably hit by rcu_preempt
> > rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.
>
> I assume it's something like this:
>
> timeout -> wakeup
>
> ->preemption
> del_timer_sync()
> .....
Yes, but it triggers frequently. Like `rcuc' is somehow aligned with
the timeout.
| <idle>-0 [007] dN.h4.. 46.299705: sched_wakeup: comm=rcuc/7 pid=53 prio=98 target_cpu=007
| <idle>-0 [007] d...2.. 46.299728: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/7 next_pid=53 next_prio=98
| rcuc/7-53 [007] d...2.. 46.299742: sched_switch: prev_comm=rcuc/7 prev_pid=53 prev_prio=98 prev_state=S ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=120
| ksoftirqd/7-54 [007] .....13 46.299750: timer_expire_entry: timer=000000003bd1e045 function=process_timeout now=4294903802 baseclk=4294903802
| ksoftirqd/7-54 [007] d...213 46.299750: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
| ksoftirqd/7-54 [007] dN..313 46.299754: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
| ksoftirqd/7-54 [007] dN..213 46.299756: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=13265 [ns] vruntime=3012610540 [ns]
| ksoftirqd/7-54 [007] d...213 46.299760: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=R+ ==> next_comm=rcu_preempt next_pid=11 next_prio=98
| rcu_preempt-11 [007] d...311 46.299766: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=120 newprio=98
del_timer_sync()
| rcu_preempt-11 [007] d...211 46.299773: sched_switch: prev_comm=rcu_preempt prev_pid=11 prev_prio=98 prev_state=R+ ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=98
| ksoftirqd/7-54 [007] .....13 46.299774: timer_expire_exit: timer=000000003bd1e045
| ksoftirqd/7-54 [007] dN..311 46.299784: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=98 newprio=120
| ksoftirqd/7-54 [007] dN..311 46.299788: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
| ksoftirqd/7-54 [007] dN..411 46.299790: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
| ksoftirqd/7-54 [007] dN..311 46.299792: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=7404 [ns] vruntime=3012617944 [ns]
| ksoftirqd/7-54 [007] d...2.. 46.299797: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=S ==> next_comm=rcu_preempt next_pid=11 next_prio=98
> Thanks,
>
> tglx
Sebastian
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-07 15:25 ` Sebastian Andrzej Siewior
@ 2020-12-07 16:06 ` Paul E. McKenney
2020-12-08 8:50 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2020-12-07 16:06 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
Anna-Maria Behnsen
On Mon, Dec 07, 2020 at 04:25:33PM +0100, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 15:29:50 [+0100], Thomas Gleixner wrote:
> > On Mon, Dec 07 2020 at 14:07, Sebastian Andrzej Siewior wrote:
> > > One thing I noticed while testing it is that the "corner" case in
> > > timer_sync_wait_running() is quite reliably hit by rcu_preempt
> > > rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.
> >
> > I assume it's something like this:
> >
> > timeout -> wakeup
> >
> > ->preemption
> > del_timer_sync()
> > .....
>
> Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
> the timeout.
Given that a lot of RCU processing is event-driven based on timers,
and given that the scheduling-clock interrupts are synchronized for
energy-efficiency reasons on many configs, maybe this alignment is
expected behavior?
Thanx, Paul
> | <idle>-0 [007] dN.h4.. 46.299705: sched_wakeup: comm=rcuc/7 pid=53 prio=98 target_cpu=007
> | <idle>-0 [007] d...2.. 46.299728: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/7 next_pid=53 next_prio=98
> | rcuc/7-53 [007] d...2.. 46.299742: sched_switch: prev_comm=rcuc/7 prev_pid=53 prev_prio=98 prev_state=S ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=120
> | ksoftirqd/7-54 [007] .....13 46.299750: timer_expire_entry: timer=000000003bd1e045 function=process_timeout now=4294903802 baseclk=4294903802
> | ksoftirqd/7-54 [007] d...213 46.299750: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> | ksoftirqd/7-54 [007] dN..313 46.299754: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> | ksoftirqd/7-54 [007] dN..213 46.299756: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=13265 [ns] vruntime=3012610540 [ns]
> | ksoftirqd/7-54 [007] d...213 46.299760: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=R+ ==> next_comm=rcu_preempt next_pid=11 next_prio=98
> | rcu_preempt-11 [007] d...311 46.299766: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=120 newprio=98
> del_timer_sync()
> | rcu_preempt-11 [007] d...211 46.299773: sched_switch: prev_comm=rcu_preempt prev_pid=11 prev_prio=98 prev_state=R+ ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=98
> | ksoftirqd/7-54 [007] .....13 46.299774: timer_expire_exit: timer=000000003bd1e045
> | ksoftirqd/7-54 [007] dN..311 46.299784: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=98 newprio=120
> | ksoftirqd/7-54 [007] dN..311 46.299788: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> | ksoftirqd/7-54 [007] dN..411 46.299790: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> | ksoftirqd/7-54 [007] dN..311 46.299792: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=7404 [ns] vruntime=3012617944 [ns]
> | ksoftirqd/7-54 [007] d...2.. 46.299797: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=S ==> next_comm=rcu_preempt next_pid=11 next_prio=98
>
>
> > Thanks,
> >
> > tglx
>
> Sebastian
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-07 16:06 ` Paul E. McKenney
@ 2020-12-08 8:50 ` Sebastian Andrzej Siewior
2020-12-08 15:04 ` Paul E. McKenney
2020-12-11 14:36 ` Thomas Gleixner
0 siblings, 2 replies; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-08 8:50 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
Anna-Maria Behnsen
On 2020-12-07 08:06:48 [-0800], Paul E. McKenney wrote:
> > Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
> > the timeout.
>
> Given that a lot of RCU processing is event-driven based on timers,
> and given that the scheduling-clock interrupts are synchronized for
> energy-efficiency reasons on many configs, maybe this alignment is
> expected behavior?
No, it is the fact that rcu_preempt has a higher priority than
ksoftirqd. So immediately after the wakeup (of rcu_preempt) there is a
context switch and expire_timers() has this:
| raw_spin_unlock_irq(&base->lock);
| call_timer_fn(timer, fn, baseclk);
| raw_spin_lock_irq(&base->lock);
| base->running_timer = NULL;
| timer_sync_wait_running(base);
So ->running_timer isn't reset yet and try_to_del_timer_sync() (from the
del_timer_sync() in schedule_timeout()) returns -1, and then the corner
case is handled where `expiry_lock' is acquired. So everything goes as
expected.
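For reference, the waiter side that runs into this corner case looks roughly
like the sketch below; it is paraphrased from the del_timer_sync() /
del_timer_wait_running() path on PREEMPT_RT, not verbatim kernel code:

/* simplified del_timer_sync() wait loop (sketch) */
/* base: the timer's timer_base (lookup from timer->flags omitted) */
do {
        ret = try_to_del_timer_sync(timer);     /* -1: callback still marked running */

        if (unlikely(ret < 0)) {
                /*
                 * Mark the base contended and take expiry_lock, which the
                 * expiry code holds across the callback.
                 * timer_sync_wait_running() drops and retakes it so this
                 * waiter gets in (and PI-boosts the expiring thread on RT).
                 */
                atomic_inc(&base->timer_waiters);
                spin_lock_bh(&base->expiry_lock);
                atomic_dec(&base->timer_waiters);
                spin_unlock_bh(&base->expiry_lock);

                cpu_relax();
        }
} while (ret < 0);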
> Thanx, Paul
Sebastian
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-08 8:50 ` Sebastian Andrzej Siewior
@ 2020-12-08 15:04 ` Paul E. McKenney
2020-12-11 14:36 ` Thomas Gleixner
1 sibling, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2020-12-08 15:04 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
Anna-Maria Behnsen
On Tue, Dec 08, 2020 at 09:50:49AM +0100, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 08:06:48 [-0800], Paul E. McKenney wrote:
> > > Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
> > > the timeout.
> >
> > Given that a lot of RCU processing is event-driven based on timers,
> > and given that the scheduling-clock interrupts are synchronized for
> > energy-efficiency reasons on many configs, maybe this alignment is
> > expected behavior?
>
> No, it is the fact that rcu_preempt has a higher priority than
> ksoftirqd. So immediately after the wakeup (of rcu_preempt) there is a
> context switch and expire_timers() has this:
>
> | raw_spin_unlock_irq(&base->lock);
> | call_timer_fn(timer, fn, baseclk);
> | raw_spin_lock_irq(&base->lock);
> | base->running_timer = NULL;
> | timer_sync_wait_running(base);
>
> So ->running_timer isn't reset and try_to_del_timer_sync() (that
> del_timer_sync() from schedule_timeout()) returns -1 and then the corner
> case is handled where `expiry_lock' is acquired. So everything goes as
> expected.
Makes sense! Thank you for the explanation!
Thanx, Paul
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-08 8:50 ` Sebastian Andrzej Siewior
2020-12-08 15:04 ` Paul E. McKenney
@ 2020-12-11 14:36 ` Thomas Gleixner
2020-12-11 15:04 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2020-12-11 14:36 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Paul E. McKenney
Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Anna-Maria Behnsen
On Tue, Dec 08 2020 at 09:50, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 08:06:48 [-0800], Paul E. McKenney wrote:
>> > Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
>> > the timeout.
>>
>> Given that a lot of RCU processing is event-driven based on timers,
>> and given that the scheduling-clock interrupts are synchronized for
>> energy-efficiency reasons on many configs, maybe this alignment is
>> expected behavior?
>
> No, it is the fact that rcu_preempt has a higher priority than
> ksoftirqd. So immediately after the wakeup (of rcu_preempt) there is a
> context switch and expire_timers() has this:
>
> | raw_spin_unlock_irq(&base->lock);
> | call_timer_fn(timer, fn, baseclk);
> | raw_spin_lock_irq(&base->lock);
> | base->running_timer = NULL;
> | timer_sync_wait_running(base);
>
> So ->running_timer isn't reset and try_to_del_timer_sync() (that
> del_timer_sync() from schedule_timeout()) returns -1 and then the corner
> case is handled where `expiry_lock' is acquired. So everything goes as
> expected.
Well, but even without that change you have the same situation:
timer_fn()
wakeup()
-->preemption
del_timer_sync()
if (running)
wait_for_running()
lock(expiry)
running = NULL
sync_wait_running()
unlock(expiry)
wakeup_lock()
-->preemption
...
lock(base)
So the change at hand does not make things worse, right?
Thanks,
tglx
* Re: timers: Move clearing of base::timer_running under base::lock
2020-12-11 14:36 ` Thomas Gleixner
@ 2020-12-11 15:04 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-11 15:04 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Paul E. McKenney, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
Anna-Maria Behnsen
On 2020-12-11 15:36:27 [+0100], Thomas Gleixner wrote:
> So the change at hand does not make things worse, right?
Correct. This *ping*-*pong* was there before this patch because
->running_timer is always cleared after the callback finishes.
I was just curious why all of a sudden there are *that* many users of
the corner case.
> Thanks,
>
> tglx
Sebastian
* [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock
2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
2020-12-07 1:10 ` Frederic Weisbecker
2020-12-07 13:07 ` Sebastian Andrzej Siewior
@ 2021-07-27 19:00 ` tip-bot2 for Thomas Gleixner
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-07-27 19:00 UTC (permalink / raw)
To: linux-tip-commits
Cc: syzbot+aa7c2385d46c5eba0b89, syzbot+abea4558531bae1ba9fe,
Thomas Gleixner, Sebastian Andrzej Siewior, stable, x86,
linux-kernel
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: bb7262b295472eb6858b5c49893954794027cd84
Gitweb: https://git.kernel.org/tip/bb7262b295472eb6858b5c49893954794027cd84
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Sun, 06 Dec 2020 22:40:07 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 27 Jul 2021 20:57:44 +02:00
timers: Move clearing of base::timer_running under base:: Lock
syzbot reported KCSAN data races vs. timer_base::timer_running being set to
NULL without holding base::lock in expire_timers().
This looks innocent and most reads are clearly not problematic, but
Frederic identified an issue which is:
int data = 0;

void timer_func(struct timer_list *t)
{
        data = 1;
}

CPU 0                                            CPU 1
------------------------------                   --------------------------
base = lock_timer_base(timer, &flags);           raw_spin_unlock(&base->lock);
if (base->running_timer != timer)                call_timer_fn(timer, fn, baseclk);
  ret = detach_if_pending(timer, base, true);    base->running_timer = NULL;
raw_spin_unlock_irqrestore(&base->lock, flags);  raw_spin_lock(&base->lock);

x = data;
If the timer has previously executed on CPU 1, then CPU 0 can observe
base->running_timer == NULL and return, assuming the timer has completed.
But it is not guaranteed on all architectures that CPU 0 also observes the
timer function's side effects (x == 1 above). The comment for
del_timer_sync() makes that guarantee. Moving the assignment under
base->lock prevents this.
For non-RT kernel it's performance wise completely irrelevant whether the
store happens before or after taking the lock. For an RT kernel moving the
store under the lock requires an extra unlock/lock pair in the case that
there is a waiter for the timer, but that's not the end of the world.
Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
Fixes: 030dcdd197d7 ("timers: Prepare support for PREEMPT_RT")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/87lfea7gw8.fsf@nanos.tec.linutronix.de
Cc: stable@vger.kernel.org
---
kernel/time/timer.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 9eb11c2..e3d2c23 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1265,8 +1265,10 @@ static inline void timer_base_unlock_expiry(struct timer_base *base)
 static void timer_sync_wait_running(struct timer_base *base)
 {
         if (atomic_read(&base->timer_waiters)) {
+                raw_spin_unlock_irq(&base->lock);
                 spin_unlock(&base->expiry_lock);
                 spin_lock(&base->expiry_lock);
+                raw_spin_lock_irq(&base->lock);
         }
 }

@@ -1457,14 +1459,14 @@ static void expire_timers(struct timer_base *base, struct hlist_head *head)
                 if (timer->flags & TIMER_IRQSAFE) {
                         raw_spin_unlock(&base->lock);
                         call_timer_fn(timer, fn, baseclk);
-                        base->running_timer = NULL;
                         raw_spin_lock(&base->lock);
+                        base->running_timer = NULL;
                 } else {
                         raw_spin_unlock_irq(&base->lock);
                         call_timer_fn(timer, fn, baseclk);
+                        raw_spin_lock_irq(&base->lock);
                         base->running_timer = NULL;
                         timer_sync_wait_running(base);
-                        raw_spin_lock_irq(&base->lock);
                 }
         }
 }
Thread overview: 13+ messages
2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
2020-12-07 1:10 ` Frederic Weisbecker
2020-12-07 12:25 ` Peter Zijlstra
2020-12-07 12:49 ` Frederic Weisbecker
2020-12-07 13:07 ` Sebastian Andrzej Siewior
2020-12-07 14:29 ` Thomas Gleixner
2020-12-07 15:25 ` Sebastian Andrzej Siewior
2020-12-07 16:06 ` Paul E. McKenney
2020-12-08 8:50 ` Sebastian Andrzej Siewior
2020-12-08 15:04 ` Paul E. McKenney
2020-12-11 14:36 ` Thomas Gleixner
2020-12-11 15:04 ` Sebastian Andrzej Siewior
2021-07-27 19:00 ` [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock tip-bot2 for Thomas Gleixner