LKML Archive on lore.kernel.org
* [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT
@ 2021-08-25 13:27 Frederic Weisbecker
  2021-08-26 11:53 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 4+ messages in thread
From: Frederic Weisbecker @ 2021-08-25 13:27 UTC (permalink / raw)
  To: linux-rt-users
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Steven Rostedt, Mike Galbraith, Sebastian Andrzej Siewior

Hi,

Ok the patch is gross but at least this lets me start a discussion
about the issue.

---
From d9d66d650b3dac8947a34464dd2e0b546a8c6b63 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <frederic@kernel.org>
Date: Wed, 25 Aug 2021 14:24:54 +0200
Subject: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT

The eventpoll lock has been converted to an rwlock some time ago with:

	a218cc491420 (epoll: use rwlock in order to reduce
					ep_poll_callback() contention)

Unfortunately this can result in scenarios where a high priority caller
of epoll_wait() needs to wait for the completion of lower priority wakers.

The typical scenario is:

1) epoll_wait() waits and sleeps for new events in the ep_poll() loop.

2) New events arrive in ep_poll_callback(); the waiter is woken up while
   ep->lock is read-acquired.

3) The high priority waiter preempts the waker but it can't acquire the
   write lock in epoll_wait() so it blocks waiting for the low prio waker
   without priority inheritance.

I guess making readlock writer fair is still not the plan so all I can
propose is to make that rwlock build-conditional.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 fs/eventpoll.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 1e596e1d0bba..c1fb4b01ea4f 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1133,7 +1133,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 	unsigned long flags;
 	int ewake = 0;
 
-	read_lock_irqsave(&ep->lock, flags);
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		read_lock_irqsave(&ep->lock, flags);
+	else
+		write_lock_irqsave(&ep->lock, flags);
 
 	ep_set_busy_poll_napi_id(epi);
 
@@ -1197,7 +1200,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 		pwake++;
 
 out_unlock:
-	read_unlock_irqrestore(&ep->lock, flags);
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		read_unlock_irqrestore(&ep->lock, flags);
+	else
+		write_unlock_irqrestore(&ep->lock, flags);
 
 	/* We have to call this outside the lock */
 	if (pwake)
-- 
2.25.1



* Re: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT
  2021-08-25 13:27 [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT Frederic Weisbecker
@ 2021-08-26 11:53 ` Sebastian Andrzej Siewior
  2021-08-26 20:30   ` John Ogness
  0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-08-26 11:53 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-rt-users, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Steven Rostedt, Mike Galbraith

On 2021-08-25 15:27:54 [+0200], Frederic Weisbecker wrote:
> I guess making readlock writer fair is still not the plan so all I can
> propose is to make that rwlock build-conditional.

It is writer fair in the sense that once a writer attempts to acquire the
lock, no new readers are allowed in.
What you want is for the writer to pi-boost each reader, which is what is
not done (multi-reader boost). Long ago there was an attempt to make
this happen (I think with rwsem) but it turned out to be problematic.
There was a workaround that allowed only one reader and did PI as
usual.
This was then dropped because multi-reader became a must-have for
other reasons, and in the meantime the lack of pi-boosting wasn't that
*problematic* anymore. The problematic user was converted to RCU in the
meantime, with a lockless read side and a regular lock for the writer.
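
To illustrate the difference between "no new readers once a writer waits"
and "the writer boosts the readers", here is a minimal single-threaded
userspace sketch. It is not kernel code and not how the RT rwlock is
implemented; struct toy_rwlock and the toy_*() helpers are hypothetical
names made up for this example, and they model only the admission rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the admission rule only: no blocking, no PI, no atomics. */
struct toy_rwlock {
	int readers;		/* readers currently holding the lock */
	bool writer_waiting;	/* a writer has announced itself */
	bool writer_active;	/* a writer holds the lock */
};

/* A reader gets in only if no writer holds or waits for the lock. */
static inline bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->writer_active || l->writer_waiting)
		return false;
	l->readers++;
	return true;
}

static inline void toy_read_unlock(struct toy_rwlock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/*
 * A writer that finds readers inside marks itself as waiting. From then
 * on new readers are refused (the "writer fair" part), but the writer
 * still has to wait for every existing reader to leave, and nothing in
 * this model boosts those readers.
 */
static inline bool toy_write_trylock(struct toy_rwlock *l)
{
	if (l->writer_active)
		return false;
	if (l->readers > 0) {
		l->writer_waiting = true;
		return false;
	}
	l->writer_waiting = false;
	l->writer_active = true;
	return true;
}

static inline void toy_write_unlock(struct toy_rwlock *l)
{
	assert(l->writer_active);
	l->writer_active = false;
}
```

With one reader inside, toy_write_trylock() fails and sets writer_waiting,
and a second toy_read_trylock() then fails as well. The writer still only
succeeds after toy_read_unlock(). If that reader is a preempted low
priority task, that wait is the part PI would have to cover.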

> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Sebastian


* Re: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT
  2021-08-26 11:53 ` Sebastian Andrzej Siewior
@ 2021-08-26 20:30   ` John Ogness
  2021-08-27 10:07     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 4+ messages in thread
From: John Ogness @ 2021-08-26 20:30 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Frederic Weisbecker
  Cc: linux-rt-users, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Steven Rostedt, Mike Galbraith

On 2021-08-26, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> On 2021-08-25 15:27:54 [+0200], Frederic Weisbecker wrote:
>> I guess making readlock writer fair is still not the plan so all I can
>> propose is to make that rwlock build-conditional.
>
> It is writer fair in the sense that once a writer attempts to acquire
> the lock no new readers are allowed in.
>
> What you want is that the writer pi-boosts each reader which is what
> is not done (multi reader boost). Long ago there was an attempt to
> make this happen (I think with rwsem) but it turned out to be
> problematic.  There was a workaround by only allowing one reader and
> doing PI as usual.

This patch is essentially forcing that exact workaround for eventpoll.

John Ogness


* Re: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT
  2021-08-26 20:30   ` John Ogness
@ 2021-08-27 10:07     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-08-27 10:07 UTC (permalink / raw)
  To: John Ogness
  Cc: Frederic Weisbecker, linux-rt-users, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Steven Rostedt, Mike Galbraith

On 2021-08-26 22:36:04 [+0206], John Ogness wrote:
> On 2021-08-26, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > On 2021-08-25 15:27:54 [+0200], Frederic Weisbecker wrote:
> >> I guess making readlock writer fair is still not the plan so all I can
> >> propose is to make that rwlock build-conditional.
> > It is writer fair in the sense that once a writer attempts to acquire
> > the lock no new readers are allowed in.
> >
> > What you want is that the writer pi-boosts each reader which is what
> > is not done (multi reader boost). Long ago there was an attempt to
> > make this happen (I think with rwsem) but it turned out to be
> > problematic.  There was a workaround by only allowing one reader and
> > doing PI as usual.
> 
> This patch is essentially forcing that exact workaround for eventpoll.

Frederic ended the mail with "readlock is not writer fair", so I
explained that it is, that he means something else (multi-reader PI
boosting), and that this is not coming. I also suggested between the
lines that he might try to move the reader side to RCU.
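
For reference, the general shape of "lockless read side, regular lock for
the writer" can be sketched in userspace C as below. This is only an
RCU-like illustration under loud assumptions: struct snapshot, snap_read()
and snap_update() are hypothetical names, the pthread mutex merely stands
in for a regular kernel lock, and there is no grace-period handling, so
old versions are leaked on purpose. It says nothing about how eventpoll
itself would have to be restructured.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical payload; stands in for whatever the readers need to see. */
struct snapshot {
	int nr_ready;
};

static _Atomic(struct snapshot *) current_snap;
static pthread_mutex_t update_lock = PTHREAD_MUTEX_INITIALIZER;

/* Read side: takes no lock, so an updater never waits for a reader here. */
static inline int snap_read(void)
{
	struct snapshot *s = atomic_load_explicit(&current_snap,
						  memory_order_acquire);

	return s ? s->nr_ready : -1;	/* -1: nothing published yet */
}

/* Update side: writers serialize on a regular lock and publish a new copy. */
static inline int snap_update(int nr_ready)
{
	struct snapshot *new_snap, *old_snap;

	assert(nr_ready >= 0);	/* negative values mean "nothing published" */

	new_snap = malloc(sizeof(*new_snap));
	if (!new_snap)
		return -1;
	new_snap->nr_ready = nr_ready;

	pthread_mutex_lock(&update_lock);
	old_snap = atomic_exchange_explicit(&current_snap, new_snap,
					    memory_order_acq_rel);
	pthread_mutex_unlock(&update_lock);

	/*
	 * Real RCU would free the old copy only after a grace period. This
	 * sketch has no grace-period tracking, so it deliberately leaks it.
	 */
	(void)old_snap;
	return 0;
}
```

The point of the pattern is that the only lock left is the one between
writers, and a regular lock is where PI works as usual.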

> John Ogness

Sebastian
