LKML Archive on lore.kernel.org
* [ANNOUNCE] v5.14-rc3-rt1
@ 2021-07-30 11:07 Sebastian Andrzej Siewior
  2021-07-30 15:21 ` v5.14-rc3-rt1 losing wakeups? Mike Galbraith
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-07-30 11:07 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, linux-rt-users, Steven Rostedt

Dear RT folks!

I'm pleased to announce the v5.14-rc3-rt1 patch set. 

Changes since v5.13-rt1:

  - Update to v5.14-rc3

  - Update the locking bits.

  - Update the mm/slub bits from Vlastimil Babka. The SLUB_CPU_PARTIAL
    option can now be enabled/disabled. There is a significant
    hackbench-related regression due to the rework compared to the old
    SLUB bits we had. However, I haven't seen any impact in real-world
    workloads. If there is anything, please let me know.

  - The "Memory controller" (CONFIG_MEMCG) has been disabled because
    optimisations it received upstream made it incompatible with
    PREEMPT_RT.

  - An RCU warning and an ARM64 warning have been fixed by Valentin
    Schneider. It is still not clear whether the RCU-related change is
    correct.

  - A fix for an ARM64 regression which popped up in the v5.13 release
    and led to a freeze while starting the init process.
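As a concrete illustration of the SLUB_CPU_PARTIAL item above (a sketch: in a real tree you would use scripts/config, which ships with the kernel sources; a dummy .config stands in here so the snippet runs anywhere):

```shell
# In a kernel tree:  scripts/config --disable SLUB_CPU_PARTIAL
# The same edit, shown against a minimal dummy .config:
set -e
cd "$(mktemp -d)"
printf 'CONFIG_SLUB_CPU_PARTIAL=y\n' > .config
sed -i 's/^CONFIG_SLUB_CPU_PARTIAL=y$/# CONFIG_SLUB_CPU_PARTIAL is not set/' .config
cat .config    # prints "# CONFIG_SLUB_CPU_PARTIAL is not set"
```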

Known issues
     - netconsole triggers WARN.

You can get this release via the git tree at:

    git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git v5.14-rc3-rt1

The RT patch against v5.14-rc3 can be found here:

    https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.14/older/patch-5.14-rc3-rt1.patch.xz

The split quilt queue is available at:

    https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.14/older/patches-5.14-rc3-rt1.tar.xz
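For reference, the mechanics of applying such an xz-compressed patch look like this (a self-contained sketch: a tiny dummy tree and patch stand in for the real download, so no network access and no 5.14 sources are assumed):

```shell
# Self-contained sketch of the patch-application mechanics.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p a/kernel b/kernel
printf 'old line\n' > a/kernel/file.c
printf 'new line\n' > b/kernel/file.c

# Create an xz-compressed patch, like the ones on cdn.kernel.org.
diff -u a/kernel/file.c b/kernel/file.c > demo.patch || true
xz demo.patch

# Apply it the way you would apply the real thing, i.e. roughly:
#   cd linux-5.14-rc3 && xz -dc patch-5.14-rc3-rt1.patch.xz | patch -p1
cd a
xz -dc ../demo.patch.xz | patch -p1
cat kernel/file.c    # prints "new line"
```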

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* v5.14-rc3-rt1 losing wakeups?
  2021-07-30 11:07 [ANNOUNCE] v5.14-rc3-rt1 Sebastian Andrzej Siewior
@ 2021-07-30 15:21 ` Mike Galbraith
  2021-07-30 20:49   ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Galbraith @ 2021-07-30 15:21 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: LKML, linux-rt-users, Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 1358 bytes --]

On Fri, 2021-07-30 at 13:07 +0200, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v5.14-rc3-rt1 patch set.

Damn, I was hoping to figure out wth is going on before the 62 patch
version of tglx/rtmutex branch made its way out into the big wide
world, but alas, I was too slow.

I started meeting GUI hangs as soon as I merged the 62 patch series
into my 5.13-rt based master tree.  I took tglx/rtmutex (977db8e523f5)
back to 5.13-rt to make sure it wasn't some booboo I had made in the
rolled forward tree, but the hangs followed the backport, and I just
met them in virgin v5.14-rc3-rt1, so unfortunately it wasn't some local
booboo, there's a bug lurking.  Maybe a config sensitive one, as what
I'm seeing on my box seems unlikely to escape into the wild otherwise.

First symptom is KDE/Plasma's task manager going comatose.  If I notice
soon enough with a !lockdep kernel, I can crashdump, or rather I can
with my 5.13-rt based kernel; a dump from shiny new rt1 can't be loaded
by crash for some reason.  The local tree dumps I've looked at haven't
been helpful anyway, box at a glance looks fine.  With lockdep enabled,
a failing kernel gets so buggered it isn't even able to crashdump.

Config attached.  Oh, a lockdep-enabled kernel fails sooner, but both
fail here fairly quickly.

	-Mike

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 39257 bytes --]


* Re: v5.14-rc3-rt1 losing wakeups?
  2021-07-30 15:21 ` v5.14-rc3-rt1 losing wakeups? Mike Galbraith
@ 2021-07-30 20:49   ` Thomas Gleixner
  2021-07-31  1:03     ` Mike Galbraith
  2021-08-01  3:36     ` Mike Galbraith
  0 siblings, 2 replies; 13+ messages in thread
From: Thomas Gleixner @ 2021-07-30 20:49 UTC (permalink / raw)
  To: Mike Galbraith, Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

Mike,

On Fri, Jul 30 2021 at 17:21, Mike Galbraith wrote:
> On Fri, 2021-07-30 at 13:07 +0200, Sebastian Andrzej Siewior wrote:
>> Dear RT folks!
>>
>> I'm pleased to announce the v5.14-rc3-rt1 patch set.
>
> Damn, I was hoping to figure out wth is going on before the 62 patch
> version of tglx/rtmutex branch made its way out into the big wide
> world, but alas, I was too slow.
>
> I started meeting GUI hangs as soon as I merged the 62 patch series
> into my 5.13-rt based master tree.  I took tglx/rtmutex (977db8e523f5)
> back to 5.13-rt to make sure it wasn't some booboo I had made in the
> rolled forward tree, but the hangs followed the backport, and I just
> met them in virgin v5.14-rc3-rt1, so unfortunately it wasn't some local
> booboo, there's a bug lurking.  Maybe a config sensitive one, as what
> I'm seeing on my box seems unlikely to escape into the wild otherwise.
>
> First symptom is KDE/Plasma's task manager going comatose.  Notice soon

KDE/Plasma points at the newfangled rtmutex-based ww_mutex from
Peter.  I tried to test the heck out of it...

Which graphics driver is in use on that machine?

> been helpful anyway, box at a glance looks fine.  With lockdep enabled,
> a failing kernel gets so buggered it isn't even able to crashdump.

Ouch.

Thanks,

        tglx


* Re: v5.14-rc3-rt1 losing wakeups?
  2021-07-30 20:49   ` Thomas Gleixner
@ 2021-07-31  1:03     ` Mike Galbraith
  2021-07-31  3:33       ` Mike Galbraith
  2021-08-01  3:36     ` Mike Galbraith
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Galbraith @ 2021-07-31  1:03 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
> >
> > First symptom is KDE/Plasma's task manager going comatose.  Notice soon
>
> KDE/Plasma points at the new fangled rtmutex based ww_mutex from
> Peter.

Yeah, that seems to be most of the 50->62 [now 63] commit delta, with
the 50 turning out to be what I was already running.  I'm staring, but
as yet nothing has poked me in the eye.

>  I tried to test the heck out of it...

And futextests, rt-tests etc. work fine; just stay away from the GUI.

> Which graphics driver is in use on that machine?

The all too often problem child nouveau, but I don't _think_ it's
playing a role in this, as nomodeset hangs as well.  I'll stuff it into
lappy (i915) to double check that.

	-Mike



* Re: v5.14-rc3-rt1 losing wakeups?
  2021-07-31  1:03     ` Mike Galbraith
@ 2021-07-31  3:33       ` Mike Galbraith
  2021-07-31  8:50         ` Mike Galbraith
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Galbraith @ 2021-07-31  3:33 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On Sat, 2021-07-31 at 03:03 +0200, Mike Galbraith wrote:
> On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
> >
>
> > Which graphics driver is in use on that machine?
>
> The all too often problem child nouveau, but I don't _think_ it's
> playing a roll in this, as nomodeset hangs as well.  I'll stuff it into
> lappy (i915) to double check that.

Hm, lappy doesn't seem to want to hang.  i915 isn't virgin, it needed a
couple of bandaids, but I doubt my grubby fingerprints matter.

The same kernel in a KVM guest that's essentially a mirror of my
desktop box (as is lappy), running under a stable host kernel, didn't
even get the GUI fully up before hanging.  I was able to log in and
crash it via virsh, and as with previous dumps, there's nothing running
except the shell that's me nuking it.  All the GUI goop looks rather
Sleeping Beauty like... no evil stepmothers in sight, just snoozing
away.

	-Mike



* Re: v5.14-rc3-rt1 losing wakeups?
  2021-07-31  3:33       ` Mike Galbraith
@ 2021-07-31  8:50         ` Mike Galbraith
  0 siblings, 0 replies; 13+ messages in thread
From: Mike Galbraith @ 2021-07-31  8:50 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On Sat, 2021-07-31 at 05:33 +0200, Mike Galbraith wrote:
> On Sat, 2021-07-31 at 03:03 +0200, Mike Galbraith wrote:
> > On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
> > >
> >
> > > Which graphics driver is in use on that machine?
> >
> > The all too often problem child nouveau, but I don't _think_ it's
> > playing a roll in this, as nomodeset hangs as well.  I'll stuff it into
> > lappy (i915) to double check that.
>
> Hm, lappy doesn't seem to want to hang.  i915 isn't virgin, it needing
> a couple bandaids, but I doubt my grubby fingerprints matter.

Hours later i915 lappy is still working fine.

The problem is not my nvidia/nouveau desktop box.  Taking the KVM
images over to lappy on a USB SSD, the same QXL box that hangs with the
desktop box as host also hangs with lappy as host... a host that's
still running the very same kernel, the same build in fact (all
instances are), just fine.

	-Mike



* Re: v5.14-rc3-rt1 losing wakeups?
  2021-07-30 20:49   ` Thomas Gleixner
  2021-07-31  1:03     ` Mike Galbraith
@ 2021-08-01  3:36     ` Mike Galbraith
  2021-08-01 15:14       ` Mike Galbraith
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Galbraith @ 2021-08-01  3:36 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
> >
> > First symptom is KDE/Plasma's task manager going comatose.  Notice soon
>
> KDE/Plasma points at the new fangled rtmutex based ww_mutex from
> Peter.

Seems not.  When booting the KVM box with nomodeset, there's exactly
one early-boot ww_mutex lock/unlock, ancient history by the failure
point.

	-Mike



* Re: v5.14-rc3-rt1 losing wakeups?
  2021-08-01  3:36     ` Mike Galbraith
@ 2021-08-01 15:14       ` Mike Galbraith
  2021-08-02  7:02         ` Sebastian Andrzej Siewior
  2021-08-02  9:12         ` Thomas Gleixner
  0 siblings, 2 replies; 13+ messages in thread
From: Mike Galbraith @ 2021-08-01 15:14 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On Sun, 2021-08-01 at 05:36 +0200, Mike Galbraith wrote:
> On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
> > >
> > > First symptom is KDE/Plasma's task manager going comatose.  Notice soon
> >
> > KDE/Plasma points at the new fangled rtmutex based ww_mutex from
> > Peter.
>
> Seems not.  When booting KVM box with nomodeset, there's exactly one
> early boot ww_mutex lock/unlock, ancient history at the failure point.

As you've probably already surmised, given it isn't the ww_mutex bits,
it's the wake_q bits.  Apply the below, and 5.14-rt ceases to fail.
Take perfectly healthy 5.13-rt, apply those bits, and it instantly
begins failing as 5.14-rt had been.

---
 include/linux/sched/wake_q.h    |    7 +------
 kernel/futex.c                  |    4 ++--
 kernel/locking/rtmutex.c        |   18 +++++++++++-------
 kernel/locking/rtmutex_api.c    |    6 +++---
 kernel/locking/rtmutex_common.h |   22 +++++++++++-----------
 kernel/sched/core.c             |    4 ++--
 6 files changed, 30 insertions(+), 31 deletions(-)

--- a/include/linux/sched/wake_q.h
+++ b/include/linux/sched/wake_q.h
@@ -61,11 +61,6 @@ static inline bool wake_q_empty(struct w

 extern void wake_q_add(struct wake_q_head *head, struct task_struct *task);
 extern void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task);
-extern void __wake_up_q(struct wake_q_head *head, unsigned int state);
-
-static inline void wake_up_q(struct wake_q_head *head)
-{
-	__wake_up_q(head, TASK_NORMAL);
-}
+extern void wake_up_q(struct wake_q_head *head);

 #endif /* _LINUX_SCHED_WAKE_Q_H */
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1499,11 +1499,11 @@ static void mark_wake_futex(struct wake_
  */
 static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_state)
 {
+	DEFINE_RT_MUTEX_WAKE_Q_HEAD(wqh);
+	u32 curval, newval;
 	struct rt_mutex_waiter *top_waiter;
 	struct task_struct *new_owner;
 	bool postunlock = false;
-	DEFINE_RT_WAKE_Q(wqh);
-	u32 curval, newval;
 	int ret = 0;

 	top_waiter = rt_mutex_top_waiter(&pi_state->pi_mutex);
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -425,20 +425,24 @@ static __always_inline void rt_mutex_adj
 }

 /* RT mutex specific wake_q wrappers */
-static __always_inline void rt_mutex_wake_q_add(struct rt_wake_q_head *wqh,
+static __always_inline void rt_mutex_wake_q_add(struct rt_mutex_wake_q_head *wqh,
 						struct rt_mutex_waiter *w)
 {
 	if (IS_ENABLED(CONFIG_PREEMPT_RT) && w->wake_state != TASK_NORMAL) {
-		wake_q_add(&wqh->rt_head, w->task);
+		get_task_struct(w->task);
+		wqh->rtlock_task = w->task;
 	} else {
 		wake_q_add(&wqh->head, w->task);
 	}
 }

-static __always_inline void rt_mutex_wake_up_q(struct rt_wake_q_head *wqh)
+static __always_inline void rt_mutex_wake_up_q(struct rt_mutex_wake_q_head *wqh)
 {
-	if (IS_ENABLED(CONFIG_PREEMPT_RT) && !wake_q_empty(&wqh->rt_head))
-		__wake_up_q(&wqh->rt_head, TASK_RTLOCK_WAIT);
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && wqh->rtlock_task) {
+		wake_up_state(wqh->rtlock_task, TASK_RTLOCK_WAIT);
+		put_task_struct(wqh->rtlock_task);
+		wqh->rtlock_task = NULL;
+	}

 	if (!wake_q_empty(&wqh->head))
 		wake_up_q(&wqh->head);
@@ -1111,7 +1115,7 @@ static int __sched task_blocks_on_rt_mut
  *
  * Called with lock->wait_lock held and interrupts disabled.
  */
-static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh,
+static void __sched mark_wakeup_next_waiter(struct rt_mutex_wake_q_head *wqh,
 					    struct rt_mutex_base *lock)
 {
 	struct rt_mutex_waiter *waiter;
@@ -1210,7 +1214,7 @@ static __always_inline int __rt_mutex_tr
  */
 static void __sched rt_mutex_slowunlock(struct rt_mutex_base *lock)
 {
-	DEFINE_RT_WAKE_Q(wqh);
+	DEFINE_RT_MUTEX_WAKE_Q_HEAD(wqh);
 	unsigned long flags;

 	/* irqsave required to support early boot calls */
--- a/kernel/locking/rtmutex_api.c
+++ b/kernel/locking/rtmutex_api.c
@@ -141,7 +141,7 @@ int __sched __rt_mutex_futex_trylock(str
  * @wqh:	The wake queue head from which to get the next lock waiter
  */
 bool __sched __rt_mutex_futex_unlock(struct rt_mutex_base *lock,
-				     struct rt_wake_q_head *wqh)
+				     struct rt_mutex_wake_q_head *wqh)
 {
 	lockdep_assert_held(&lock->wait_lock);

@@ -165,7 +165,7 @@ bool __sched __rt_mutex_futex_unlock(str

 void __sched rt_mutex_futex_unlock(struct rt_mutex_base *lock)
 {
-	DEFINE_RT_WAKE_Q(wqh);
+	DEFINE_RT_MUTEX_WAKE_Q_HEAD(wqh);
 	unsigned long flags;
 	bool postunlock;

@@ -454,7 +454,7 @@ void __sched rt_mutex_adjust_pi(struct t
 /*
  * Performs the wakeup of the top-waiter and re-enables preemption.
  */
-void __sched rt_mutex_postunlock(struct rt_wake_q_head *wqh)
+void __sched rt_mutex_postunlock(struct rt_mutex_wake_q_head *wqh)
 {
 	rt_mutex_wake_up_q(wqh);
 }
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -42,20 +42,20 @@ struct rt_mutex_waiter {
 };

 /**
- * rt_wake_q_head - Wrapper around regular wake_q_head to support
- *		    "sleeping" spinlocks on RT
- * @head:	The regular wake_q_head for sleeping lock variants
- * @rt_head:	The wake_q_head for RT lock (spin/rwlock) variants
+ * rt_mutex_wake_q_head - Wrapper around regular wake_q_head to support
+ *			  "sleeping" spinlocks on RT
+ * @head:		The regular wake_q_head for sleeping lock variants
+ * @rtlock_task:	Task pointer for RT lock (spin/rwlock) wakeups
  */
-struct rt_wake_q_head {
+struct rt_mutex_wake_q_head {
 	struct wake_q_head	head;
-	struct wake_q_head	rt_head;
+	struct task_struct	*rtlock_task;
 };

-#define DEFINE_RT_WAKE_Q(name)						\
-	struct rt_wake_q_head name = {					\
+#define DEFINE_RT_MUTEX_WAKE_Q_HEAD(name)				\
+	struct rt_mutex_wake_q_head name = {				\
 		.head		= WAKE_Q_HEAD_INITIALIZER(name.head),	\
-		.rt_head	= WAKE_Q_HEAD_INITIALIZER(name.rt_head),\
+		.rtlock_task	= NULL,					\
 	}

 /*
@@ -81,9 +81,9 @@ extern int __rt_mutex_futex_trylock(stru

 extern void rt_mutex_futex_unlock(struct rt_mutex_base *lock);
 extern bool __rt_mutex_futex_unlock(struct rt_mutex_base *lock,
-				struct rt_wake_q_head *wqh);
+				struct rt_mutex_wake_q_head *wqh);

-extern void rt_mutex_postunlock(struct rt_wake_q_head *wqh);
+extern void rt_mutex_postunlock(struct rt_mutex_wake_q_head *wqh);

 /*
  * Must be guarded because this header is included from rcu/tree_plugin.h
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -920,7 +920,7 @@ void wake_q_add_safe(struct wake_q_head
 		put_task_struct(task);
 }

-void __wake_up_q(struct wake_q_head *head, unsigned int state)
+void wake_up_q(struct wake_q_head *head)
 {
 	struct wake_q_node *node = head->first;

@@ -936,7 +936,7 @@ void __wake_up_q(struct wake_q_head *hea
 		 * wake_up_process() executes a full barrier, which pairs with
 		 * the queueing in wake_q_add() so as not to miss wakeups.
 		 */
-		wake_up_state(task, state);
+		wake_up_process(task);
 		put_task_struct(task);
 	}
 }




* Re: v5.14-rc3-rt1 losing wakeups?
  2021-08-01 15:14       ` Mike Galbraith
@ 2021-08-02  7:02         ` Sebastian Andrzej Siewior
  2021-08-02  7:18           ` Mike Galbraith
  2021-08-02  9:12         ` Thomas Gleixner
  1 sibling, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-08-02  7:02 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On 2021-08-01 17:14:49 [+0200], Mike Galbraith wrote:
> On Sun, 2021-08-01 at 05:36 +0200, Mike Galbraith wrote:
> > On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
> > > >
> > > > First symptom is KDE/Plasma's task manager going comatose.  Notice soon
> > >
> > > KDE/Plasma points at the new fangled rtmutex based ww_mutex from
> > > Peter.
> >
> > Seems not.  When booting KVM box with nomodeset, there's exactly one
> > early boot ww_mutex lock/unlock, ancient history at the failure point.
> 
> As you've probably already surmised given it isn't the ww_mutex bits,
> it's the wake_q bits.  Apply the below, 5.14-rt ceases to fail.  Take
> perfectly healthy 5.13-rt, apply those bits, and it instantly begins
> failing as 5.14-rt had been.

Given what you replied in the locking thread regarding
ww_mutex_lock_interruptible(), may I assume that the wake_q bits are
fine and it is just the ww_mutex?

Sebastian


* Re: v5.14-rc3-rt1 losing wakeups?
  2021-08-02  7:02         ` Sebastian Andrzej Siewior
@ 2021-08-02  7:18           ` Mike Galbraith
  2021-08-02  8:25             ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Galbraith @ 2021-08-02  7:18 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On Mon, 2021-08-02 at 09:02 +0200, Sebastian Andrzej Siewior wrote:
> On 2021-08-01 17:14:49 [+0200], Mike Galbraith wrote:
> > On Sun, 2021-08-01 at 05:36 +0200, Mike Galbraith wrote:
> > > On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
> > > > >
> > > > > First symptom is KDE/Plasma's task manager going comatose.  Notice soon
> > > >
> > > > KDE/Plasma points at the new fangled rtmutex based ww_mutex from
> > > > Peter.
> > >
> > > Seems not.  When booting KVM box with nomodeset, there's exactly one
> > > early boot ww_mutex lock/unlock, ancient history at the failure point.
> >
> > As you've probably already surmised given it isn't the ww_mutex bits,
> > it's the wake_q bits.  Apply the below, 5.14-rt ceases to fail.  Take
> > perfectly healthy 5.13-rt, apply those bits, and it instantly begins
> > failing as 5.14-rt had been.
>
> Given what you have replied to the locking thread/
> ww_mutex_lock_interruptible() may I assume that the wake_q bits are fine
> and it is just the ww_mutex?

Nope.  Before I even reverted the wake_q bits, I assembled a tree with
the ww_mutex changes completely removed, to be absolutely certain that
they were innocent, and it indeed retained its lost-wakeup woes despite
the complete absence of the newfangled ww_mutex.  5.13-rt acquired
those same wakeup woes by receiving ONLY the wake_q bits, and 5.14-rt
was cured of those woes by ONLY those bits being reverted.  I'm not
seeing the why, but those bits are either the source or the trigger of
5.14-rt's lost-wakeup woes... they're toxic in some way, shape, fashion
or form.

	-Mike



* Re: v5.14-rc3-rt1 losing wakeups?
  2021-08-02  7:18           ` Mike Galbraith
@ 2021-08-02  8:25             ` Sebastian Andrzej Siewior
  2021-08-02  8:40               ` Mike Galbraith
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-08-02  8:25 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On 2021-08-02 09:18:34 [+0200], Mike Galbraith wrote:
> Nope.  Before I even reverted the wake_q bits, I assembled a tree with
> the ww_mutex changes completely removed to be absolutely certain that
> they were innocent, and it indeed did retain its lost wakeup woes
> despite complete loss of newfangled ww_mutex.  5.13-rt acquired those
> same wakeup woes by receiving ONLY the wake_q bits, and 5.14-rt was
> cured of those woes by ONLY them being reverted. I'm not seeing the
> why, but those bits are either the source or the trigger of 5.14-rt
> lost wakeup woes... they're toxic in some way shape fashion or form.

Okay.  So the ww_mutex bits are not the culprit then.  All you are
doing is booting KDE/Plasma in KVM with virtio as the GPU, or did I mix
things up?

> 	-Mike

Sebastian


* Re: v5.14-rc3-rt1 losing wakeups?
  2021-08-02  8:25             ` Sebastian Andrzej Siewior
@ 2021-08-02  8:40               ` Mike Galbraith
  0 siblings, 0 replies; 13+ messages in thread
From: Mike Galbraith @ 2021-08-02  8:40 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

On Mon, 2021-08-02 at 10:25 +0200, Sebastian Andrzej Siewior wrote:
>
> Okay. So the ww-mutex bits are not the cure then. All you do is booting
> KDE/Plasma in kvm with virtio as GPU or did I mix up things?

My VMs are all full-on clones of my desktop box, running the KDE/Plasma
desktop; CPUs are set up to mirror my i4790, and they get half of the
box's RAM.  The VMs are a "mini-me" wart hanging off the side of the
real box, NFS-mounting various spots of the host where the data won't
fit on the 64GB virtual disk.  Display is Spice, video is QXL.

	-Mike



* Re: v5.14-rc3-rt1 losing wakeups?
  2021-08-01 15:14       ` Mike Galbraith
  2021-08-02  7:02         ` Sebastian Andrzej Siewior
@ 2021-08-02  9:12         ` Thomas Gleixner
  1 sibling, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2021-08-02  9:12 UTC (permalink / raw)
  To: Mike Galbraith, Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Steven Rostedt, Peter Zijlstra

Mike,

On Sun, Aug 01 2021 at 17:14, Mike Galbraith wrote:

> On Sun, 2021-08-01 at 05:36 +0200, Mike Galbraith wrote:
>> On Fri, 2021-07-30 at 22:49 +0200, Thomas Gleixner wrote:
>> > >
>> > > First symptom is KDE/Plasma's task manager going comatose.  Notice soon
>> >
>> > KDE/Plasma points at the new fangled rtmutex based ww_mutex from
>> > Peter.
>>
>> Seems not.  When booting KVM box with nomodeset, there's exactly one
>> early boot ww_mutex lock/unlock, ancient history at the failure point.
>
> As you've probably already surmised given it isn't the ww_mutex bits,
> it's the wake_q bits.  Apply the below, 5.14-rt ceases to fail.  Take
> perfectly healthy 5.13-rt, apply those bits, and it instantly begins
> failing as 5.14-rt had been.

Now staring at it makes it pretty obvious.  When I picked up Peter's
patch I thought about it briefly and then ignored my doubts :(

>  /* RT mutex specific wake_q wrappers */
> -static __always_inline void rt_mutex_wake_q_add(struct rt_wake_q_head *wqh,
> +static __always_inline void rt_mutex_wake_q_add(struct rt_mutex_wake_q_head *wqh,
>  						struct rt_mutex_waiter *w)
>  {
>  	if (IS_ENABLED(CONFIG_PREEMPT_RT) && w->wake_state != TASK_NORMAL) {
> -		wake_q_add(&wqh->rt_head, w->task);
> +		get_task_struct(w->task);
> +		wqh->rtlock_task = w->task;

This is the key. With the original asymmetric version the wake_q_add for
wake_state != TASK_NORMAL is storing the task unconditionally in
wqh->rtlock_task.

With that wake_q_add() we end up with the following situation:

Some code, e.g. futex does:

     wake_q_add(..., task)

which links the task, i.e. task->wake_q.next is !NULL.  Ergo the
wake_q_add() in the rtmutex code bails out.  Same the other way round:
if the rtmutex side queues first, then the second - regular - wakeup
will not be queued.

There's two ways to fix that:

  1) Go back to my original version

  2) Add another wake_q head to task_struct

#2 is overkill IMO simply because the rtlock wait is not subject to
multiple wakeups.

Thanks a lot Mike for tracking this down!

Thanks,

        tglx


end of thread, other threads:[~2021-08-02  9:12 UTC | newest]

Thread overview: 13+ messages
2021-07-30 11:07 [ANNOUNCE] v5.14-rc3-rt1 Sebastian Andrzej Siewior
2021-07-30 15:21 ` v5.14-rc3-rt1 losing wakeups? Mike Galbraith
2021-07-30 20:49   ` Thomas Gleixner
2021-07-31  1:03     ` Mike Galbraith
2021-07-31  3:33       ` Mike Galbraith
2021-07-31  8:50         ` Mike Galbraith
2021-08-01  3:36     ` Mike Galbraith
2021-08-01 15:14       ` Mike Galbraith
2021-08-02  7:02         ` Sebastian Andrzej Siewior
2021-08-02  7:18           ` Mike Galbraith
2021-08-02  8:25             ` Sebastian Andrzej Siewior
2021-08-02  8:40               ` Mike Galbraith
2021-08-02  9:12         ` Thomas Gleixner
