LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Markus Trippelsdorf <markus@trippelsdorf.de>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Florian Weimer <fweimer@redhat.com>,
	Darren Hart <dvhart@infradead.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Zhen Lei <thunder.leizhen@huawei.com>,
	Joe Korty <joe.korty@concurrent-rt.com>
Subject: [PATCH 4.4 10/11] futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
Date: Sun,  8 Aug 2021 09:22:45 +0200	[thread overview]
Message-ID: <20210808072217.670617254@linuxfoundation.org> (raw)
In-Reply-To: <20210808072217.322468704@linuxfoundation.org>

From: Peter Zijlstra <peterz@infradead.org>

[ Upstream commit 04dc1b2fff4e96cb4142227fbdc63c8871ad4ed9 ]

Markus reported that the glibc/nptl/tst-robustpi8 test was failing after
commit:

  cfafcd117da0 ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()")

The following trace shows the problem:

 ld-linux-x86-64-2161  [019] ....   410.760971: SyS_futex: 00007ffbeb76b028: 80000875  op=FUTEX_LOCK_PI
 ld-linux-x86-64-2161  [019] ...1   410.760972: lock_pi_update_atomic: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000875 ret=0
 ld-linux-x86-64-2165  [011] ....   410.760978: SyS_futex: 00007ffbeb76b028: 80000875  op=FUTEX_UNLOCK_PI
 ld-linux-x86-64-2165  [011] d..1   410.760979: do_futex: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000871 ret=0
 ld-linux-x86-64-2165  [011] ....   410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=0000
 ld-linux-x86-64-2161  [019] ....   410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=ETIMEDOUT

Task 2165 does an UNLOCK_PI, assigning the lock to the waiter task 2161
which then returns with -ETIMEDOUT. That wrecks the lock state, because now
the owner isn't aware it acquired the lock and removes the pending robust
list entry.

If 2161 is killed, the robust list will not clear out this futex and the
subsequent acquire on this futex will then (correctly) result in -ESRCH
which is unexpected by glibc, triggers an internal assertion and dies.

Task 2161			Task 2165

rt_mutex_wait_proxy_lock()
   timeout();
   /* T2161 is still queued in  the waiter list */
   return -ETIMEDOUT;

				futex_unlock_pi()
				spin_lock(hb->lock);
				rtmutex_unlock()
				  remove_rtmutex_waiter(T2161);
				   mark_lock_available();
				/* Make the next waiter owner of the user space side */
				futex_uval = 2161;
				spin_unlock(hb->lock);
spin_lock(hb->lock);
rt_mutex_cleanup_proxy_lock()
  if (rtmutex_owner() !== current)
     ...
     return FAIL;
....
return -ETIMEOUT;

This means that rt_mutex_cleanup_proxy_lock() needs to call
try_to_take_rt_mutex() so it can take over the rtmutex correctly which was
assigned by the waker. If the rtmutex is owned by some other task then this
call is harmless and just confirmes that the waiter is not able to acquire
it.

While there, fix what looks like a merge error which resulted in
rt_mutex_cleanup_proxy_lock() having two calls to
fixup_rt_mutex_waiters() and rt_mutex_wait_proxy_lock() not having any.
Both should have one, since both potentially touch the waiter list.

Fixes: 38d589f2fd08 ("futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()")
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Bug-Spotted-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Link: http://lkml.kernel.org/r/20170519154850.mlomgdsd26drq5j6@hirez.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Joe Korty <joe.korty@concurrent-rt.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/locking/rtmutex.c |   24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1764,12 +1764,14 @@ int rt_mutex_wait_proxy_lock(struct rt_m
 	int ret;
 
 	raw_spin_lock_irq(&lock->wait_lock);
-
-	set_current_state(TASK_INTERRUPTIBLE);
-
 	/* sleep on the mutex */
+	set_current_state(TASK_INTERRUPTIBLE);
 	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
-
+	/*
+	 * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
+	 * have to fix that up.
+	 */
+	fixup_rt_mutex_waiters(lock);
 	raw_spin_unlock_irq(&lock->wait_lock);
 
 	return ret;
@@ -1801,15 +1803,25 @@ bool rt_mutex_cleanup_proxy_lock(struct
 
 	raw_spin_lock_irq(&lock->wait_lock);
 	/*
+	 * Do an unconditional try-lock, this deals with the lock stealing
+	 * state where __rt_mutex_futex_unlock() -> mark_wakeup_next_waiter()
+	 * sets a NULL owner.
+	 *
+	 * We're not interested in the return value, because the subsequent
+	 * test on rt_mutex_owner() will infer that. If the trylock succeeded,
+	 * we will own the lock and it will have removed the waiter. If we
+	 * failed the trylock, we're still not owner and we need to remove
+	 * ourselves.
+	 */
+	try_to_take_rt_mutex(lock, current, waiter);
+	/*
 	 * Unless we're the owner; we're still enqueued on the wait_list.
 	 * So check if we became owner, if not, take us off the wait_list.
 	 */
 	if (rt_mutex_owner(lock) != current) {
 		remove_waiter(lock, waiter);
-		fixup_rt_mutex_waiters(lock);
 		cleanup = true;
 	}
-
 	/*
 	 * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
 	 * have to fix that up.



  parent reply	other threads:[~2021-08-08  7:23 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-08  7:22 [PATCH 4.4 00/11] 4.4.280-rc1 review Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 01/11] futex: Rename free_pi_state() to put_pi_state() Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 02/11] futex: Cleanup refcounting Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 03/11] futex,rt_mutex: Introduce rt_mutex_init_waiter() Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 04/11] futex: Pull rt_mutex_futex_unlock() out from under hb->lock Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 05/11] futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock() Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 06/11] futex: Futex_unlock_pi() determinism Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 07/11] rtmutex: Make wait_lock irq safe Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 08/11] futex: Handle transient "ownerless" rtmutex state correctly Greg Kroah-Hartman
2021-08-08  7:22 ` [PATCH 4.4 09/11] futex: Avoid freeing an active timer Greg Kroah-Hartman
2021-08-08  7:22 ` Greg Kroah-Hartman [this message]
2021-08-08  7:22 ` [PATCH 4.4 11/11] rcu: Update documentation of rcu_read_unlock() Greg Kroah-Hartman
2021-08-08 16:00 ` [PATCH 4.4 00/11] 4.4.280-rc1 review Guenter Roeck
2021-08-10 15:44   ` Greg Kroah-Hartman
2021-08-09 14:14 ` Naresh Kamboju
  -- strict thread matches above, loose matches on Subject: below --
2021-08-02 13:46 [PATCH 4.4 00/11] Fix a potential infinite loop in RT futex-pi scenarios Zhen Lei
2021-08-02 13:46 ` [PATCH 4.4 10/11] futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock() Zhen Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210808072217.670617254@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=dvhart@infradead.org \
    --cc=fweimer@redhat.com \
    --cc=joe.korty@concurrent-rt.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=markus@trippelsdorf.de \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=thunder.leizhen@huawei.com \
    --subject='Re: [PATCH 4.4 10/11] futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).