LKML Archive on lore.kernel.org
* [RFC,PATCH] workqueues: turn queue_work() into the "barrier" for work->func()
@ 2008-11-11 20:33 Oleg Nesterov
From: Oleg Nesterov @ 2008-11-11 20:33 UTC
  To: Andrew Morton, David Howells, Dmitry Torokhov, Jiri Pirko,
	Paul E. McKenney, Peter Zijlstra
  Cc: linux-kernel

To clarify, I will be happy with the "no, we don't need this" comment.

But let's suppose we have

	int VAR;

	void work_func(struct work_struct *work)
	{
		if (VAR)
			do_something();
	}

and we are doing

	VAR = 1;
	queue_work(work);

I think the caller of queue_work() has every right to expect that
the next invocation of work_func() will see "VAR == 1", but this
is not true if the work is already pending.

	run_workqueue:

		work_clear_pending(work)
			clear_bit(WORK_STRUCT_PENDING) // no mb()
	
		call work_func()
			if (VAR)

it is possible that the CPU reads VAR before it clears _PENDING,
and queue_work() "infiltrates" in between and fails. So we can miss
an event.
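
To spell out one bad interleaving (the two-column layout is just an
illustration, the precise mechanism of the reordering doesn't matter):

	CPU 0 (run_workqueue)			CPU 1 (caller)
	---------------------			--------------
	reads VAR == 0
	  (reordered before clear_bit)
						VAR = 1;
						queue_work(work);
						  /* _PENDING still set,
						     returns 0 */
	clear_bit(WORK_STRUCT_PENDING);
	work_func();
	  if (VAR)	/* sees the stale 0, the event is lost */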

I don't know if we really have such code in the kernel, and even if
we do, perhaps we should fix that code and not touch workqueues. But
perhaps the current behaviour is a bit too subtle in this respect.

For example, atkbd_event_work() happens to work correctly, but only
because it does mb() implicitly.
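
Something like the pattern below is safe, but only because the handler's
own atomic RMW op implies the full barrier (this is just a sketch of the
pattern, the names are illustrative, not the actual atkbd code):

	void event_work(struct work_struct *work)
	{
		struct my_dev *dev = container_of(work, struct my_dev, work);

		/*
		 * test_and_clear_bit() implies mb() on both sides, so this
		 * read can't be satisfied before the preceding
		 * clear_bit(WORK_STRUCT_PENDING) is visible, and the
		 * event set by the caller can't be missed.
		 */
		if (test_and_clear_bit(MY_EVENT, &dev->event_mask))
			handle_event(dev);
	}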

The patch merely adds an mb() after work_clear_pending(work); the other
side already has the mb semantics implied by test_and_set_bit(). From
now on, queue_work() always acts as a barrier for work->func().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>

--- K-28/kernel/workqueue.c~WQ_MB	2008-11-06 19:11:02.000000000 +0100
+++ K-28/kernel/workqueue.c	2008-11-11 21:06:20.000000000 +0100
@@ -291,6 +291,12 @@ static void run_workqueue(struct cpu_wor
 
 		BUG_ON(get_wq_data(work) != cwq);
 		work_clear_pending(work);
+		/*
+		 * Ensure that either the concurrent queue_work() succeeds,
+		 * or work->func() sees all the preceding memory changes.
+		 */
+		smp_mb__after_clear_bit();
+
 		lock_map_acquire(&cwq->wq->lockdep_map);
 		lock_map_acquire(&lockdep_map);
 		f(work);
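
With the barrier added, the pairing between the two sides is roughly:

	/* the caller */
	VAR = 1;
	queue_work(work);		/* test_and_set_bit() implies mb() */

	/* run_workqueue() */
	work_clear_pending(work);
	smp_mb__after_clear_bit();	/* pairs with the mb() above */
	f(work);			/* either the queue_work() above
					 * re-queued the work, or this
					 * invocation sees VAR == 1 */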



* Re: [RFC,PATCH] workqueues: turn queue_work() into the "barrier" for work->func()
  @ 2008-11-11 22:46 David Howells
From: David Howells @ 2008-11-11 22:46 UTC
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Dmitry Torokhov, Jiri Pirko,
	Paul E. McKenney, Peter Zijlstra, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> I think the caller of queue_work() has every right to expect that
> the next invocation of work_func() will see "VAR == 1", but this
> is not true if the work is already pending.

As you said, queue_work() does test_and_set_bit(), which implies smp_mb()
on either side of the operation, so you're halfway there, and run_workqueue()
calls spin_unlock_irq() just before calling work_clear_pending()...  So might
it make sense to move the work_clear_pending() into the locked section?  Or
would that require an smp_mb__before_clear_bit()?

David


* Re: [RFC,PATCH] workqueues: turn queue_work() into the "barrier" for work->func()
  @ 2008-11-12 11:58 Oleg Nesterov
From: Oleg Nesterov @ 2008-11-12 11:58 UTC
  To: David Howells
  Cc: Andrew Morton, Dmitry Torokhov, Jiri Pirko, Paul E. McKenney,
	Peter Zijlstra, linux-kernel

On 11/11, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > I think the caller of queue_work() has every right to expect that
> > the next invocation of work_func() will see "VAR == 1", but this
> > is not true if the work is already pending.
>
> As you said, queue_work() does test_and_set_bit(), which implies smp_mb()
> on either side of the operation, so you're halfway there, and run_workqueue()
> calls spin_unlock_irq() just before calling work_clear_pending()...  So might
> it make sense to move the work_clear_pending() into the locked section?  Or
> would that require an smp_mb__before_clear_bit()?

This can't really help, afaics. We still need an mb() between
clear_bit(_PENDING) and LOAD(VAR). Because unlock() is a "one way"
barrier, LOAD(VAR) can leak into the critical section, and it can be
re-ordered with clear_bit() inside the critical section.
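
IOW, even with the clear done under cwq->lock, something like this is
still possible (a sketch):

	spin_lock_irq(&cwq->lock);
	...
	work_clear_pending(work);	/* clear_bit(), no barrier */
	spin_unlock_irq(&cwq->lock);	/* one-way: the load below can move
					 * up into the critical section and
					 * above the clear_bit() */
	f(work);
		if (VAR)		/* can read the old value while
					 * _PENDING still looks set to the
					 * concurrent queue_work() */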

Oleg.

