LKML Archive on lore.kernel.org
* Re: [patch] queued spinlocks (i386)
@ 2007-03-25 15:54 Oleg Nesterov
  2007-03-27 15:22 ` Andi Kleen
  2007-03-28  7:04 ` Nick Piggin
  0 siblings, 2 replies; 9+ messages in thread
From: Oleg Nesterov @ 2007-03-25 15:54 UTC (permalink / raw)
  To: Nick Piggin, Ravikiran G Thirumalai, Ingo Molnar, Nikita Danilov,
	Andrew Morton
  Cc: linux-kernel

I am sorry for being completely off-topic, but I've been wondering about this
for a long time...

What if we replace raw_spinlock_t.slock with "struct task_struct *owner" ?

	void _spin_lock(spinlock_t *lock)
	{
		struct task_struct *owner;

		for (;;) {
			preempt_disable();
			if (likely(_raw_spin_trylock(lock)))
				break;
			preempt_enable();

			while (!spin_can_lock(lock)) {
				rcu_read_lock();
				owner = lock->owner;
				if (owner && current->prio < owner->prio &&
				    !test_tsk_thread_flag(owner, TIF_NEED_RESCHED))
					set_tsk_thread_flag(owner, TIF_NEED_RESCHED);
				rcu_read_unlock();
				cpu_relax();
			}
		}

		lock->owner = current;
	}

	void _spin_unlock(spinlock_t *lock)
	{
		lock->owner = NULL;
		_raw_spin_unlock(lock);
		preempt_enable();
	}

Now we don't need need_lockbreak(lock); need_resched() is enough, and we take
->prio into consideration.
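
For illustration, the data-structure side of this could look roughly as
follows, assuming the owner pointer sits next to the existing raw lock word
rather than literally replacing it; the layout and the helper below are only
a sketch, not a patch:

	typedef struct {
		raw_spinlock_t raw_lock;	/* existing arch lock word */
		struct task_struct *owner;	/* holder, NULL when unlocked */
	} spinlock_t;

	/*
	 * spin_can_lock() keeps testing only the raw lock word; the owner
	 * pointer is purely advisory and is dereferenced under
	 * rcu_read_lock() so the task_struct can't go away while a spinner
	 * peeks at ->prio.
	 */
	static inline int spin_can_lock(spinlock_t *lock)
	{
		return !__raw_spin_is_locked(&lock->raw_lock);
	}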

Makes sense? Or stupid?

Oleg.



* Re: [patch] queued spinlocks (i386)
  2007-03-25 15:54 [patch] queued spinlocks (i386) Oleg Nesterov
@ 2007-03-27 15:22 ` Andi Kleen
  2007-03-28  7:04 ` Nick Piggin
  1 sibling, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2007-03-27 15:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Nick Piggin, Ravikiran G Thirumalai, Ingo Molnar, Nikita Danilov,
	Andrew Morton, linux-kernel

Oleg Nesterov <oleg@tv-sign.ru> writes:
> 
> Makes sense? Or stupid?

Interesting idea. Probably worth testing.

-Andi


* Re: [patch] queued spinlocks (i386)
  2007-03-25 15:54 [patch] queued spinlocks (i386) Oleg Nesterov
  2007-03-27 15:22 ` Andi Kleen
@ 2007-03-28  7:04 ` Nick Piggin
  2007-03-29 18:42   ` Oleg Nesterov
  1 sibling, 1 reply; 9+ messages in thread
From: Nick Piggin @ 2007-03-28  7:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ravikiran G Thirumalai, Ingo Molnar, Nikita Danilov,
	Andrew Morton, linux-kernel

On Sun, Mar 25, 2007 at 07:54:07PM +0400, Oleg Nesterov wrote:
> I am sorry for being completely off-topic, but I've been wondering about this
> for a long time...
> 
> What if we replace raw_spinlock_t.slock with "struct task_struct *owner" ?
> 
> 	void _spin_lock(spinlock_t *lock)
> 	{
> 		struct task_struct *owner;
> 
> 		for (;;) {
> 			preempt_disable();
> 			if (likely(_raw_spin_trylock(lock)))
> 				break;
> 			preempt_enable();
> 
> 			while (!spin_can_lock(lock)) {
> 				rcu_read_lock();
> 				owner = lock->owner;
> 				if (owner && current->prio < owner->prio &&
> 				    !test_tsk_thread_flag(owner, TIF_NEED_RESCHED))
> 					set_tsk_thread_flag(owner, TIF_NEED_RESCHED);
> 				rcu_read_unlock();
> 				cpu_relax();
> 			}
> 		}
> 
> 		lock->owner = current;
> 	}
> 
> 	void _spin_unlock(spinlock_t *lock)
> 	{
> 		lock->owner = NULL;
> 		_raw_spin_unlock(lock);
> 		preempt_enable();
> 	}
> 
> Now we don't need need_lockbreak(lock); need_resched() is enough, and we take
> ->prio into consideration.
> 
> Makes sense? Or stupid?

Well with my queued spinlocks, all that lockbreak stuff can just come out
of the spin_lock, break_lock out of the spinlock structure, and
need_lockbreak just becomes (lock->qhead - lock->qtail > 1).
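
Spelled out as a sketch (16-bit head/tail counters as the xaddw in the patch
suggests; the exact layout and the comments are illustrative, not lifted from
the patch):

	typedef struct {
		unsigned short qhead;	/* next ticket handed to a locker */
		unsigned short qtail;	/* ticket currently holding the lock */
	} raw_spinlock_t;

	/*
	 * A locker atomically bumps qhead to claim a ticket and spins until
	 * qtail catches up; unlock bumps qtail.  More than one outstanding
	 * ticket therefore means somebody is queued behind the holder.
	 */
	#define need_lockbreak(lock) \
		((unsigned short)((lock)->qhead - (lock)->qtail) > 1)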




* Re: [patch] queued spinlocks (i386)
  2007-03-28  7:04 ` Nick Piggin
@ 2007-03-29 18:42   ` Oleg Nesterov
  2007-03-29 22:16     ` Davide Libenzi
  2007-03-30  1:53     ` Nick Piggin
  0 siblings, 2 replies; 9+ messages in thread
From: Oleg Nesterov @ 2007-03-29 18:42 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ravikiran G Thirumalai, Ingo Molnar, Nikita Danilov,
	Andrew Morton, linux-kernel

On 03/28, Nick Piggin wrote:
>
> Well with my queued spinlocks, all that lockbreak stuff can just come out
> of the spin_lock, break_lock out of the spinlock structure, and
> need_lockbreak just becomes (lock->qhead - lock->qtail > 1).

Q: queued spinlocks are not CONFIG_PREEMPT friendly,

> +       asm volatile(LOCK_PREFIX "xaddw %0, %1\n\t"
> +                    : "+r" (pos), "+m" (lock->qhead) : : "memory");
> +       while (unlikely(pos != lock->qtail))
> +               cpu_relax();

once we have incremented lock->qhead, we have no option but to spin with
preemption disabled until pos == lock->qtail, yes?

Oleg.



* Re: [patch] queued spinlocks (i386)
  2007-03-29 18:42   ` Oleg Nesterov
@ 2007-03-29 22:16     ` Davide Libenzi
  2007-03-30  2:06       ` Lee Revell
  2007-03-30  1:53     ` Nick Piggin
  1 sibling, 1 reply; 9+ messages in thread
From: Davide Libenzi @ 2007-03-29 22:16 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Nick Piggin, Ravikiran G Thirumalai, Ingo Molnar, Nikita Danilov,
	Andrew Morton, Linux Kernel Mailing List

On Thu, 29 Mar 2007, Oleg Nesterov wrote:

> On 03/28, Nick Piggin wrote:
> >
> > Well with my queued spinlocks, all that lockbreak stuff can just come out
> > of the spin_lock, break_lock out of the spinlock structure, and
> > need_lockbreak just becomes (lock->qhead - lock->qtail > 1).
> 
> Q: queued spinlocks are not CONFIG_PREEMPT friendly,

Why? Is CONFIG_PREEMPT friendly to anyone? :)



> > +       asm volatile(LOCK_PREFIX "xaddw %0, %1\n\t"
> > +                    : "+r" (pos), "+m" (lock->qhead) : : "memory");
> > +       while (unlikely(pos != lock->qtail))
> > +               cpu_relax();
> 
> once we have incremented lock->qhead, we have no option but to spin with
> preemption disabled until pos == lock->qtail, yes?

Yes, preemption and deterministic spinlock policies are not friends.



- Davide




* Re: [patch] queued spinlocks (i386)
  2007-03-29 18:42   ` Oleg Nesterov
  2007-03-29 22:16     ` Davide Libenzi
@ 2007-03-30  1:53     ` Nick Piggin
  1 sibling, 0 replies; 9+ messages in thread
From: Nick Piggin @ 2007-03-30  1:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ravikiran G Thirumalai, Ingo Molnar, Nikita Danilov,
	Andrew Morton, linux-kernel

On Thu, Mar 29, 2007 at 10:42:13PM +0400, Oleg Nesterov wrote:
> On 03/28, Nick Piggin wrote:
> >
> > Well with my queued spinlocks, all that lockbreak stuff can just come out
> > of the spin_lock, break_lock out of the spinlock structure, and
> > need_lockbreak just becomes (lock->qhead - lock->qtail > 1).
> 
> Q: queued spinlocks are not CONFIG_PREEMPT friendly,

I consider the re-enabling of preemption and interrupts to be a hack
anyway, because if you already have interrupts or preemption disabled
at entry time, they will remain disabled.

IMO the real solution is to ensure spinlock critical sections don't get
too large, and perhaps use fair spinlocks to prevent starvation.

> 
> > +       asm volatile(LOCK_PREFIX "xaddw %0, %1\n\t"
> > +                    : "+r" (pos), "+m" (lock->qhead) : : "memory");
> > +       while (unlikely(pos != lock->qtail))
> > +               cpu_relax();
> 
> once we have incremented lock->qhead, we have no option but to spin with
> preemption disabled until pos == lock->qtail, yes?

Correct. For the purposes of deadlock behaviour, we have effectively
taken the lock at that point.
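
In sketch form (simplified from the asm above, with an illustrative function
name), the slow path shows why: the xadd publishes our ticket to everyone
behind us, so we have to keep spinning, preemption still disabled, until
qtail reaches our position:

	static inline void __ticket_spin_lock(raw_spinlock_t *lock)
	{
		unsigned short pos = 1;

		/* Atomically advance qhead; pos receives our ticket number. */
		asm volatile(LOCK_PREFIX "xaddw %0, %1\n\t"
			     : "+r" (pos), "+m" (lock->qhead) : : "memory");

		/*
		 * The ticket is now visible to later lockers, who queue up
		 * behind it.  Backing out or sleeping here would stall them,
		 * so we spin until we are served.
		 */
		while (unlikely(pos != lock->qtail))
			cpu_relax();
	}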


* Re: [patch] queued spinlocks (i386)
  2007-03-29 22:16     ` Davide Libenzi
@ 2007-03-30  2:06       ` Lee Revell
  2007-03-30  2:17         ` Nick Piggin
  0 siblings, 1 reply; 9+ messages in thread
From: Lee Revell @ 2007-03-30  2:06 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Oleg Nesterov, Nick Piggin, Ravikiran G Thirumalai, Ingo Molnar,
	Nikita Danilov, Andrew Morton, Linux Kernel Mailing List

On 3/29/07, Davide Libenzi <davidel@xmailserver.org> wrote:
> On Thu, 29 Mar 2007, Oleg Nesterov wrote:
>
> > On 03/28, Nick Piggin wrote:
> > >
> > > Well with my queued spinlocks, all that lockbreak stuff can just come out
> > > of the spin_lock, break_lock out of the spinlock structure, and
> > > need_lockbreak just becomes (lock->qhead - lock->qtail > 1).
> >
> > Q: queued spinlocks are not CONFIG_PREEMPT friendly,
>
> Why? Is CONFIG_PREEMPT friendly to anyone? :)

Until someone fixes all the places in the kernel where scheduling can
be held off for tens of milliseconds, CONFIG_PREEMPT will be an
absolute requirement for many applications like audio and gaming.

Many of these were fixed a while back during early -rt development but
at some point the process stalled as the remaining cases were too hard
to fix...

Lee


* Re: [patch] queued spinlocks (i386)
  2007-03-30  2:06       ` Lee Revell
@ 2007-03-30  2:17         ` Nick Piggin
  2007-03-30  4:44           ` Lee Revell
  0 siblings, 1 reply; 9+ messages in thread
From: Nick Piggin @ 2007-03-30  2:17 UTC (permalink / raw)
  To: Lee Revell
  Cc: Davide Libenzi, Oleg Nesterov, Ravikiran G Thirumalai,
	Ingo Molnar, Nikita Danilov, Andrew Morton,
	Linux Kernel Mailing List

On Thu, Mar 29, 2007 at 10:06:41PM -0400, Lee Revell wrote:
> On 3/29/07, Davide Libenzi <davidel@xmailserver.org> wrote:
> >On Thu, 29 Mar 2007, Oleg Nesterov wrote:
> >
> >> On 03/28, Nick Piggin wrote:
> >> >
> >> > Well with my queued spinlocks, all that lockbreak stuff can just come out
> >> > of the spin_lock, break_lock out of the spinlock structure, and
> >> > need_lockbreak just becomes (lock->qhead - lock->qtail > 1).
> >>
> >> Q: queued spinlocks are not CONFIG_PREEMPT friendly,
> >
> >Why? Is CONFIG_PREEMPT friendly to anyone? :)
> 
> Until someone fixes all the places in the kernel where scheduling can
> be held off for tens of milliseconds, CONFIG_PREEMPT will be an
> absolute requirement for many applications like audio and gaming.

There's nothing wrong with CONFIG_PREEMPT for those users. We have
a few other performance concessions activated with CONFIG_PREEMPT on.
I think a usual upper bound of a few milliseconds (especially for SMP) is
reasonable for a non-preempt kernel.



* Re: [patch] queued spinlocks (i386)
  2007-03-30  2:17         ` Nick Piggin
@ 2007-03-30  4:44           ` Lee Revell
  0 siblings, 0 replies; 9+ messages in thread
From: Lee Revell @ 2007-03-30  4:44 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Davide Libenzi, Oleg Nesterov, Ravikiran G Thirumalai,
	Ingo Molnar, Nikita Danilov, Andrew Morton,
	Linux Kernel Mailing List, Eric Dumazet

On 3/29/07, Nick Piggin <npiggin@suse.de> wrote:
> On Thu, Mar 29, 2007 at 10:06:41PM -0400, Lee Revell wrote:
> > Until someone fixes all the places in the kernel where scheduling can
> > be held off for tens of milliseconds, CONFIG_PREEMPT will be an
> > absolute requirement for many applications like audio and gaming.
>
> There's nothing wrong with CONFIG_PREEMPT for those users. We have
> a few other performance concessions activated with CONFIG_PREEMPT on.
> I think a usual upper bound of a few milliseconds (especially for SMP) is
> reasonable for a non-preempt kernel.

This is within reach - the only major offender left is
rt_secret_rebuild (and possibly other areas of the route cache
handling).  Eric Dumazet had some suggestions to fix it, but the
details are beyond my area of expertise.

Lee



Thread overview: 9+ messages
2007-03-25 15:54 [patch] queued spinlocks (i386) Oleg Nesterov
2007-03-27 15:22 ` Andi Kleen
2007-03-28  7:04 ` Nick Piggin
2007-03-29 18:42   ` Oleg Nesterov
2007-03-29 22:16     ` Davide Libenzi
2007-03-30  2:06       ` Lee Revell
2007-03-30  2:17         ` Nick Piggin
2007-03-30  4:44           ` Lee Revell
2007-03-30  1:53     ` Nick Piggin
