LKML Archive on lore.kernel.org
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
@ 2009-05-22 18:50 Michael S. Zick
2009-05-22 19:24 ` Roland Dreier
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 18:50 UTC (permalink / raw)
To: linux-kernel
On Fri May 22 2009, you wrote:
> "Michael S. Zick" <lkml@morethan.org> writes:
>
> > Found in the bit-rot for 32-bit, x86, Uni-processor builds:
>
> Actually uni processor should not use the lock prefix
> because it doesn't need it; the only exceptions are some special
> ops used in para-virtualization which are special cased.
>
Unless you have interrupts enabled, then you have two contexts.
Only xchg is "naturally" atomic.
Mike
> -Andi
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 18:50 [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic Michael S. Zick
@ 2009-05-22 19:24 ` Roland Dreier
2009-05-22 20:03 ` Michael S. Zick
0 siblings, 1 reply; 65+ messages in thread
From: Roland Dreier @ 2009-05-22 19:24 UTC (permalink / raw)
To: Michael S. Zick; +Cc: linux-kernel
> Unless you have interrupts enabled, then you have two contexts.
> Only xchg is "naturally" atomic.
Isn't the lock prefix about consistency between multiple processors?
The x86 architecture always handles interrupts on instruction
boundaries. I'm guessing you're worried about definitions like
	static inline void atomic_inc(atomic_t *v)
	{
		asm volatile(LOCK_PREFIX "incl %0"
			     : "+m" (v->counter));
	}
which compiles to just "incl" (with no lock prefix) on uniprocessor
kernels; but the IA-32 architecture guarantees that the incl instruction
cannot be interrupted between reading the old value and writing the new
value.
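
For anyone who wants to poke at this outside the kernel tree, here is a minimal userspace sketch of the same pattern. It is a simplification, not the kernel's actual header: DEMO_SMP is a made-up stand-in for the real config switch, count_to() is a hypothetical helper, and the non-x86 branch (a compiler builtin) is only there so the file builds elsewhere.

```c
/* DEMO_SMP is a hypothetical stand-in for the kernel's SMP config switch. */
#ifdef DEMO_SMP
#define LOCK_PREFIX "lock; "
#else
#define LOCK_PREFIX ""          /* uniprocessor: plain "incl" */
#endif

typedef struct { int counter; } atomic_t;

static inline void atomic_inc(atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
	/* Single read-modify-write instruction on the memory operand. */
	__asm__ __volatile__(LOCK_PREFIX "incl %0"
			     : "+m" (v->counter));
#else
	/* Assumption: GCC/Clang builtin as a portable fallback. */
	__atomic_fetch_add(&v->counter, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Hypothetical helper: increment a fresh counter n times. */
static int count_to(int n)
{
	atomic_t v = { 0 };
	int i;

	for (i = 0; i < n; i++)
		atomic_inc(&v);
	return v.counter;
}
```

Building with and without -DDEMO_SMP and running objdump -d on the result should show whether the lock prefix byte is emitted in front of the incl.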
- R.
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 19:24 ` Roland Dreier
@ 2009-05-22 20:03 ` Michael S. Zick
0 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 20:03 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-kernel
On Fri May 22 2009, Roland Dreier wrote:
>
> > Unless you have interrupts enabled, then you have two contexts.
> > Only xchg is "naturally" atomic.
>
> Isn't the lock prefix about consistency between multiple processors?
> The x86 architecture always handles interrupts on instruction
> boundaries. I'm guessing you're worried about definitions like
>
> 	static inline void atomic_inc(atomic_t *v)
> 	{
> 		asm volatile(LOCK_PREFIX "incl %0"
> 			     : "+m" (v->counter));
> 	}
>
> which compiles to just "incl" (with no lock prefix) on uniprocessor
> kernels; but the IA-32 architecture guarantees that the incl instruction
> cannot be interrupted between reading the old value and writing the new
> value.
>
Not prior to P-4, and since then only "may" be done atomically,
see reference post in my earlier reply.
PS: And yes, that was where I spotted the usage first. ;)
Mike
> - R.
>
>
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 6:49 ` Harald Welte
2009-05-24 12:38 ` Michael S. Zick
@ 2009-05-30 15:48 ` Michael S. Zick
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-30 15:48 UTC (permalink / raw)
To: Harald Welte
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun May 24 2009, Harald Welte wrote:
> Dear hpa, and others,
>
> On Sat, May 23, 2009 at 04:44:08PM -0700, H. Peter Anvin wrote:
> > It looks like there might be a problem with the C7-M ... Michael reports
> > that if he sets LOCK_PREFIX to "lock;" it works, but that shouldn't be
> > necessary for a uniprocessor.
> >
>
> I will try my best to help, though I have to admit I'm far from being
> a x86 expert, and particularly not with regard to low-level bits such as atomic
> operations.
>
> So please give me some time to research some background about that,
> and read up all the details on the currently reported/described problem.
>
> Once I understand it in full detail, I can talk to the right people inside
> CentaurLabs (VIA's CPU division).
>
> If somebody (optionally) can phrase a precise technical question that I can
> directly forward to somebody with low-level x86 knowledge but no Linux background,
> it would definitely help speeding up the process.
>
Does the C7-M instruction set define the 'pause' instruction (0xf3,0x90)?
*Defined* since the P-4, but backward compatible with earlier ia32 processors
even though it falls into the "don't use rep before non-string instructions" category.
Mike
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-28 21:16 ` Pavel Machek
@ 2009-05-28 21:21 ` Michael S. Zick
0 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-28 21:21 UTC (permalink / raw)
To: Pavel Machek; +Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
On Thu May 28 2009, Pavel Machek wrote:
> Hi!
>
> > > > > I have seen some problems on via c7m based machines, where some 'smart
> > > > > bios person' implemented EC access in AML (normally, it is accessed
> > > > > > from ec.c driver). Maybe you have similarly bad bios?
> > > >
> > > > > How to tell or distinguish?
> > > > Did your looking at the dmidecode output show you that?
> > >
> > > Disassemble DSDT, and if you see strange code duplicating kernel's
> > > ec.c driver, you have similar problem...
> >
> > Someone did that but wasn't looking for "strange code" - just fixing
> > some entry size errors.
> > You can find the replacement DSDT here:
> > http://forum.netbookuser.com/viewtopic.php?pid=6512#p6512
> >
> > (Which I am not using, since it is mostly cosmetic.)
>
> Ok, it does not seem to have braindead EC implementation. The DSDT
> does not look familiar, so it may be different issue. (Or it is same
> issue and we were not able to debug it due to all the BIOS problems.)
>
Thanks for taking a look,
it would have meant nothing to me.
Mike
> Pavel
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-28 20:58 ` Michael S. Zick
@ 2009-05-28 21:16 ` Pavel Machek
2009-05-28 21:21 ` Michael S. Zick
0 siblings, 1 reply; 65+ messages in thread
From: Pavel Machek @ 2009-05-28 21:16 UTC (permalink / raw)
To: Michael S. Zick
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
Hi!
> > > > I have seen some problems on via c7m based machines, where some 'smart
> > > > bios person' implemented EC access in AML (normally, it is accessed
> > > > from ec.c driver). Maybe you have similarly bad bios?
> > >
> > > How to tell or distinguish?
> > > Did your looking at the dmidecode output show you that?
> >
> > Disassemble DSDT, and if you see strange code duplicating kernel's
> > ec.c driver, you have similar problem...
>
> Someone did that but wasn't looking for "strange code" - just fixing
> some entry size errors.
> You can find the replacement DSDT here:
> http://forum.netbookuser.com/viewtopic.php?pid=6512#p6512
>
> (Which I am not using, since it is mostly cosmetic.)
Ok, it does not seem to have braindead EC implementation. The DSDT
does not look familiar, so it may be different issue. (Or it is same
issue and we were not able to debug it due to all the BIOS problems.)
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-28 20:50 ` Pavel Machek
@ 2009-05-28 20:58 ` Michael S. Zick
2009-05-28 21:16 ` Pavel Machek
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-28 20:58 UTC (permalink / raw)
To: Pavel Machek; +Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
On Thu May 28 2009, Pavel Machek wrote:
> On Thu 2009-05-28 08:29:13, Michael S. Zick wrote:
> > On Thu May 28 2009, Pavel Machek wrote:
> > > Hi!
> > >
> > > > The observation that executing an unnecessary 'lock' opcode in some
> > > > cases slows down the machine is not felt by myself to be significant
> > > > to duplicating my observations. Note: I have been wrong before.
> > > >
> > > > This is as informative as I can make the message.
> > > >
> > > > PS: *not* a single machine failure, tested on five machines, owned
> > > > by four different people, two brands, with different use histories.
> > >
> > > I have seen some problems on via c7m based machines, where some 'smart
> > > bios person' implemented EC access in AML (normally, it is accessed
> > > from ec.c driver). Maybe you have similarly bad bios?
> > >
> >
> > How to tell or distinguish?
> > Did your looking at the dmidecode output show you that?
>
> Disassemble DSDT, and if you see strange code duplicating kernel's
> ec.c driver, you have similar problem...
Someone did that but wasn't looking for "strange code" - just fixing
some entry size errors.
You can find the replacement DSDT here:
http://forum.netbookuser.com/viewtopic.php?pid=6512#p6512
(Which I am not using, since it is mostly cosmetic.)
Mike
> Pavel
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-28 20:30 ` Pavel Machek
@ 2009-05-28 20:54 ` Michael S. Zick
0 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-28 20:54 UTC (permalink / raw)
To: Pavel Machek
Cc: H. Peter Anvin, Harald Welte, Ingo Molnar, Thomas Gleixner,
linux-kernel, Alan Cox
On Thu May 28 2009, Pavel Machek wrote:
> Hi!
>
> > > @hpa - I still like your suggestion that it is only one (or a few)
> > > uses of atomic ops that is incorrect and in general atomic ops
> > > should compile away on uni-processor.
> > >
> >
> > Actually, the more I think about it the more I suspect there is a race
> > condition either in the chip set or in any VIA-specific drivers (if
> > there are any.) Putting LOCKs in random places will slow the CPU down
> > significantly, so it might resolve the race condition without actually
> > solving the problem.
>
> Which you can verify; replace lock with something slow (pushad,
> popad)? And see what happens.
>
> (And if it never ever triggers on hp2133, you have strong clue that it
> may not be cpu-related, but bios-related or chipset related or something).
>
> Some time ago I was trying to debug mysterious hangs on some
> via/fic machines.
>
> We never figured out what was wrong, but we discovered many other bios
> bugs, and those were not being fixed; so debugging was
> hard/impossible. Unfortunately I no longer have access to that hw.
>
Then I am not losing my mind here - *it is* a difficult problem. ;)
> hp2133 did _not_ have that problem.
>
Today's build has been playing me music for over 8 hours on the
HP-2133 (C7M-CN896) but can't get past a couple of hours on the
(fic) Everex Cloudbook (C7M-CX700).
Also, the distro on the Cloudbook is using pulse-audio - the
distro on the HP is not. So I am reviewing the recent bug
fixes to kernel/futex for something over-looked. ;)
May be a wild goose chase, but I think pulse-audio uses futexes.
Thanks for the other hints.
Mike
> Try forcing maximum throttling, then move mouse for like five
> seconds. If kbc dies, you have same buggy bios, and probably are
> debugging same problem....
> Pavel
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-28 13:29 ` Michael S. Zick
@ 2009-05-28 20:50 ` Pavel Machek
2009-05-28 20:58 ` Michael S. Zick
0 siblings, 1 reply; 65+ messages in thread
From: Pavel Machek @ 2009-05-28 20:50 UTC (permalink / raw)
To: Michael S. Zick
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
On Thu 2009-05-28 08:29:13, Michael S. Zick wrote:
> On Thu May 28 2009, Pavel Machek wrote:
> > Hi!
> >
> > > The observation that executing an unnecessary 'lock' opcode in some
> > > cases slows down the machine is not felt by myself to be significant
> > > to duplicating my observations. Note: I have been wrong before.
> > >
> > > This is as informative as I can make the message.
> > >
> > > PS: *not* a single machine failure, tested on five machines, owned
> > > by four different people, two brands, with different use histories.
> >
> > I have seen some problems on via c7m based machines, where some 'smart
> > bios person' implemented EC access in AML (normally, it is accessed
> > from ec.c driver). Maybe you have similarly bad bios?
> >
>
> How to tell or distinguish?
> Did your looking at the dmidecode output show you that?
Disassemble DSDT, and if you see strange code duplicating kernel's
ec.c driver, you have similar problem...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 18:00 ` H. Peter Anvin
2009-05-24 18:32 ` Michael S. Zick
@ 2009-05-28 20:30 ` Pavel Machek
2009-05-28 20:54 ` Michael S. Zick
1 sibling, 1 reply; 65+ messages in thread
From: Pavel Machek @ 2009-05-28 20:30 UTC (permalink / raw)
To: H. Peter Anvin
Cc: lkml, Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Hi!
> > @hpa - I still like your suggestion that it is only one (or a few)
> > uses of atomic ops that is incorrect and in general atomic ops
> > should compile away on uni-processor.
> >
>
> Actually, the more I think about it the more I suspect there is a race
> condition either in the chip set or in any VIA-specific drivers (if
> there are any.) Putting LOCKs in random places will slow the CPU down
> significantly, so it might resolve the race condition without actually
> solving the problem.
Which you can verify; replace lock with something slow (pushad,
popad)? And see what happens.
(And if it never ever triggers on hp2133, you have strong clue that it
may not be cpu-related, but bios-related or chipset related or something).
Some time ago I was trying to debug mysterious hangs on some
via/fic machines.
We never figured out what was wrong, but we discovered many other bios
bugs, and those were not being fixed; so debugging was
hard/impossible. Unfortunately I no longer have access to that hw.
hp2133 did _not_ have that problem.
Try forcing maximum throttling, then move mouse for like five
seconds. If kbc dies, you have same buggy bios, and probably are
debugging same problem....
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-28 12:48 ` Pavel Machek
@ 2009-05-28 13:29 ` Michael S. Zick
2009-05-28 20:50 ` Pavel Machek
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-28 13:29 UTC (permalink / raw)
To: Pavel Machek; +Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
On Thu May 28 2009, Pavel Machek wrote:
> Hi!
>
> > The observation that executing an unnecessary 'lock' opcode in some
> > cases slows down the machine is not felt by myself to be significant
> > to duplicating my observations. Note: I have been wrong before.
> >
> > This is as informative as I can make the message.
> >
> > PS: *not* a single machine failure, tested on five machines, owned
> > by four different people, two brands, with different use histories.
>
> I have seen some problems on via c7m based machines, where some 'smart
> bios person' implemented EC access in AML (normally, it is accessed
> from ec.c driver). Maybe you have similarly bad bios?
>
How to tell or distinguish?
Did your looking at the dmidecode output show you that?
Mike
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 22:21 ` Michael S. Zick
2009-05-22 23:30 ` H. Peter Anvin
@ 2009-05-28 12:48 ` Pavel Machek
2009-05-28 13:29 ` Michael S. Zick
1 sibling, 1 reply; 65+ messages in thread
From: Pavel Machek @ 2009-05-28 12:48 UTC (permalink / raw)
To: Michael S. Zick
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
Hi!
> The observation that executing an unnecessary 'lock' opcode in some
> cases slows down the machine is not felt by myself to be significant
> to duplicating my observations. Note: I have been wrong before.
>
> This is as informative as I can make the message.
>
> PS: *not* a single machine failure, tested on five machines, owned
> by four different people, two brands, with different use histories.
I have seen some problems on via c7m based machines, where some 'smart
bios person' implemented EC access in AML (normally, it is accessed
from ec.c driver). Maybe you have similarly bad bios?
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-27 22:13 ` Roland Dreier
@ 2009-05-27 22:33 ` Michael S. Zick
0 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-27 22:33 UTC (permalink / raw)
To: Roland Dreier
Cc: peterz, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
On Wed May 27 2009, Roland Dreier wrote:
>
> > The only objective information is posted here:
> > http://lkml.org/lkml/2009/5/20/342
>
> Not sure if you've looked at this, but it's a lockdep trace that looks
> to be a valid lockdep report due to non-annotated code (I don't *think*
> it's a bug). To summarize, there is the code path in
> kernel/irq/spurious.c that does:
>
I haven't looked at it - beyond my skill level.
Still trying to deal with a machine where the only symptom is a deadlock.
So I post these for someone else's eyes until I figure out the deadlock.
Mike
> poll_spurious_irq_timer ->
> poll_spurious_irqs() [from timer, with hard IRQs on] ->
> poll_all_shared_irqs() [if we think an IRQ got stuck] ->
> try_one_irq() ->
> spin_lock(&desc->lock) [as above -- hard IRQs on]
>
> while kernel/irq/chip.c has:
>
> handle_level_irq() [called with hard IRQs off] ->
> spin_lock(&desc->lock) [as above -- hard IRQs off]
>
> and lockdep can't tell that the interrupt corresponding to desc has been
> disabled if we ever actually reach try_one_irq(), so there's no risk of
> the interrupt coming in and deadlocking while the try_one_irq() code
> holds desc->lock.
>
> Unfortunately I don't know how to annotate this...
>
>
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 10:44 ` Michael S. Zick
2009-05-23 11:18 ` Michael S. Zick
2009-05-24 7:04 ` Harald Welte
@ 2009-05-27 22:13 ` Roland Dreier
2009-05-27 22:33 ` Michael S. Zick
2 siblings, 1 reply; 65+ messages in thread
From: Roland Dreier @ 2009-05-27 22:13 UTC (permalink / raw)
To: peterz; +Cc: lkml, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
> The only objective information is posted here:
> http://lkml.org/lkml/2009/5/20/342
Not sure if you've looked at this, but it's a lockdep trace that looks
to be a valid lockdep report due to non-annotated code (I don't *think*
it's a bug). To summarize, there is the code path in
kernel/irq/spurious.c that does:
    poll_spurious_irq_timer ->
      poll_spurious_irqs()      [from timer, with hard IRQs on] ->
        poll_all_shared_irqs()  [if we think an IRQ got stuck] ->
          try_one_irq() ->
            spin_lock(&desc->lock)  [as above -- hard IRQs on]
while kernel/irq/chip.c has:
    handle_level_irq()          [called with hard IRQs off] ->
      spin_lock(&desc->lock)    [as above -- hard IRQs off]
and lockdep can't tell that the interrupt corresponding to desc has been
disabled if we ever actually reach try_one_irq(), so there's no risk of
the interrupt coming in and deadlocking while the try_one_irq() code
holds desc->lock.
Unfortunately I don't know how to annotate this...
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-26 12:37 ` Michael S. Zick
@ 2009-05-26 17:13 ` H. Peter Anvin
0 siblings, 0 replies; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-26 17:13 UTC (permalink / raw)
To: lkml; +Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Michael S. Zick wrote:
>
> Disassembly of section .text:
>
> 00000000 <diff_umask>:
> 0: 8b 44 24 0c mov 0xc(%esp),%eax
> 4: 8b 4c 24 04 mov 0x4(%esp),%ecx
> 8: 8b 10 mov (%eax),%edx
> a: 8d 04 11 lea (%ecx,%edx,1),%eax
> d: 8b 54 24 08 mov 0x8(%esp),%edx
> 11: 2b 02 sub (%edx),%eax
> 13: 21 c8 and %ecx,%eax
> 15: c3 ret
>
> = = = =
>
> Checking the byte string 0x8d, 0x04, 0x11 against the Intel
> documentation shows that the disassembly output of objdump
> is incorrect - that bit string does not have an offset field.
> That is the byte encoding for the gcc assembly input.
>
> What's a person to do when the tool-chain lies?
>
The ,1 isn't an offset field... it's a scale factor.
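
To make that concrete, here is a small sketch that splits the two bytes after the 0x8d opcode into their ModRM and SIB fields. The struct and helper names are made up for illustration and this is not a general decoder; it only covers the ModRM-plus-SIB form seen above.

```c
#include <stdint.h>

/* Field layout of the ModRM and SIB bytes in 32-bit addressing. */
struct modrm_sib {
	unsigned mod, reg, rm;		/* ModRM: 2 + 3 + 3 bits */
	unsigned scale, index, base;	/* SIB: 2 + 3 + 3 bits; scale kept as 1, 2, 4 or 8 */
};

/* Register numbers as used in the reg, index and base fields. */
static const char *const reg32_name[8] = {
	"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* Hypothetical helper: split a ModRM byte and the SIB byte that follows it. */
static struct modrm_sib decode_modrm_sib(uint8_t modrm, uint8_t sib)
{
	struct modrm_sib f;

	f.mod   = modrm >> 6;
	f.reg   = (modrm >> 3) & 7;
	f.rm    = modrm & 7;		/* rm == 4 with mod != 3 means "SIB follows" */
	f.scale = 1u << (sib >> 6);	/* 2-bit field encodes 1, 2, 4, 8 */
	f.index = (sib >> 3) & 7;
	f.base  = sib & 7;
	return f;
}
```

Feeding it 0x04 and 0x11 gives mod 0, reg 0 (eax), rm 4 (SIB follows), scale 1, index 2 (edx), base 1 (ecx). That is (%ecx,%edx,1) into %eax with no displacement bytes, so objdump is just printing the scale explicitly where gcc's output leaves the default of 1 implied.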
-hpa
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-26 0:05 ` H. Peter Anvin
@ 2009-05-26 12:37 ` Michael S. Zick
2009-05-26 17:13 ` H. Peter Anvin
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-26 12:37 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Mon May 25 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> >
> > Load Effective Address does two's complement arithmetic?
> > I'll take your word for it.
> >
>
> LEA, and all other address calculations use 2's-complement arithmetic:
>
> leal -1(%ebx),%eax
> leal 0xffffffff(%ebx),%eax
>
> ... is the same instruction.
>
> However, gcc has been known to optimize out range checks when operating
> on signed integers; it is allowed to do this by the C standard, but it
> can give surprising results if the user expected wraparound.
>
Well, it isn't a range check - - but this illustrates where my (false)
concern came from:
Given this input file:
extern int diff_umask(int mask, int *cnt1, int *cnt2)
{ return (((mask - *cnt1) + *cnt2) & mask); }
Doing:
gcc -O2 -S -fomit-frame-pointer difftest.c
Yields (as difftest.s):
.file "difftest.c"
.text
.p2align 4,,15
.globl diff_umask
.type diff_umask, @function
diff_umask:
movl 12(%esp), %eax
movl 4(%esp), %ecx
movl (%eax), %edx
leal (%ecx,%edx), %eax
movl 8(%esp), %edx
subl (%edx), %eax
andl %ecx, %eax
ret
.size diff_umask, .-diff_umask
.ident "GCC: (Debian 4.3.2-1.1) 4.3.2"
.section .note.GNU-stack,"",@progbits
Now follow that up with the command:
gcc -O2 -c -fomit-frame-pointer difftest.s
Then examine the result with objdump:
objdump -d difftest.o
In relevant part, yields:
difftest.o: file format elf32-i386
Disassembly of section .text:
00000000 <diff_umask>:
0: 8b 44 24 0c mov 0xc(%esp),%eax
4: 8b 4c 24 04 mov 0x4(%esp),%ecx
8: 8b 10 mov (%eax),%edx
a: 8d 04 11 lea (%ecx,%edx,1),%eax
d: 8b 54 24 08 mov 0x8(%esp),%edx
11: 2b 02 sub (%edx),%eax
13: 21 c8 and %ecx,%eax
15: c3 ret
= = = =
Checking the byte string 0x8d, 0x04, 0x11 against the Intel
documentation shows that the disassembly output of objdump
is incorrect - that bit string does not have an offset field.
That is the byte encoding for the gcc assembly input.
What's a person to do when the tool-chain lies?
Mike
> -hpa
>
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-25 23:03 ` Michael S. Zick
2009-05-25 23:35 ` Michael S. Zick
@ 2009-05-26 0:05 ` H. Peter Anvin
2009-05-26 12:37 ` Michael S. Zick
1 sibling, 1 reply; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-26 0:05 UTC (permalink / raw)
To: lkml; +Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Michael S. Zick wrote:
>
> Load Effective Address does two's complement arithmetic?
> I'll take your word for it.
>
LEA, and all other address calculations use 2's-complement arithmetic:
leal -1(%ebx),%eax
leal 0xffffffff(%ebx),%eax
... is the same instruction.
However, gcc has been known to optimize out range checks when operating
on signed integers; it is allowed to do this by the C standard, but it
can give surprising results if the user expected wraparound.
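
A rough C model of the displacement case, for illustration only (lea_base_disp is a made-up name, and unsigned arithmetic is used so the wraparound is well defined in C):

```c
#include <stdint.h>

/*
 * Models the address computed for disp(%base): base + disp, modulo 2^32.
 * Unsigned addition wraps, which mirrors the 32-bit address adder.
 */
static uint32_t lea_base_disp(uint32_t base, uint32_t disp)
{
	return base + disp;
}
```

Passing (uint32_t)-1 or 0xffffffff as disp hands over the same 32 bits, so both forms yield base - 1.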
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-25 23:03 ` Michael S. Zick
@ 2009-05-25 23:35 ` Michael S. Zick
2009-05-26 0:05 ` H. Peter Anvin
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-25 23:35 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Mon May 25 2009, Michael S. Zick wrote:
PS: gcc-4.1.2 does compile the function the same
within the main file or as a stand-alone file.
Along with maintaining the programmer specified
order of operations without trying to hardcode
corrections to LEA.
I'll stick with 4.1.2 myself. YMMV
Mike
> On Mon May 25 2009, H. Peter Anvin wrote:
> > Michael S. Zick wrote:
> > > On Mon May 25 2009, Michael S. Zick wrote:
> > >
> > > In actual application, this *should not* make a difference.
> > >
> >
> > No kidding. This is a valid transformation for integers, since it is
> > all done with 2's-complement arithmetic.
> >
>
> Load Effective Address does two's complement arithmetic?
> I'll take your word for it.
>
> For example:
>
> #include <stdio.h>
>
> extern int diff_umask(int mask, int *cnt1, int *cnt2)
> { return (((mask - *cnt1) + *cnt2) & mask); }
>
> int main() {
> int msk = 0x7fffffff; /* max positive */
> int idx1 = 0x7ffffffd; /* max positive - 2 */
> int idx2 = 0x7fffffff; /* max positive */
>
> int rst;
>
> rst = diff_umask(msk, &idx1, &idx2);
> printf("\n\t%d\n", rst); /* " 1 " - correct */
> }
>
> But that is because when it is compiled as a
> single source file, gcc is hardcoding the lea
> adjustment when it is not an external file:
> (compare to the above listings)
> Like I wrote - I don't use 31-bit ring buffers, so I don't care.
>
> objdump -d testdiff:
> - - - snip - - -
> 080483b0 <diff_umask>:
> 80483b0: 8b 44 24 0c mov 0xc(%esp),%eax
> 80483b4: 8b 4c 24 04 mov 0x4(%esp),%ecx
> 80483b8: 8b 10 mov (%eax),%edx
> 80483ba: 8d 04 11 lea (%ecx,%edx,1),%eax
> 80483bd: 8b 54 24 08 mov 0x8(%esp),%edx
> 80483c1: 2b 02 sub (%edx),%eax
> 80483c3: 21 c8 and %ecx,%eax
> 80483c5: c3 ret
> - - - snip - - -
>
> Mike
>
> > Floating-point numbers is a whole other game.
> >
> > -hpa
> >
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-25 21:17 ` H. Peter Anvin
@ 2009-05-25 23:03 ` Michael S. Zick
2009-05-25 23:35 ` Michael S. Zick
2009-05-26 0:05 ` H. Peter Anvin
0 siblings, 2 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-25 23:03 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Mon May 25 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> > On Mon May 25 2009, Michael S. Zick wrote:
> >
> > In actual application, this *should not* make a difference.
> >
>
> No kidding. This is a valid transformation for integers, since it is
> all done with 2's-complement arithmetic.
>
Load Effective Address does two's complement arithmetic?
I'll take your word for it.
For example:
#include <stdio.h>

extern int diff_umask(int mask, int *cnt1, int *cnt2)
{ return (((mask - *cnt1) + *cnt2) & mask); }

int main() {
	int msk = 0x7fffffff; /* max positive */
	int idx1 = 0x7ffffffd; /* max positive - 2 */
	int idx2 = 0x7fffffff; /* max positive */

	int rst;

	rst = diff_umask(msk, &idx1, &idx2);
	printf("\n\t%d\n", rst); /* " 1 " - correct */
}
But that is because when it is compiled as a
single source file, gcc is hardcoding the lea
adjustment when it is not an external file:
(compare to the above listings)
Like I wrote - I don't use 31-bit ring buffers, so I don't care.
objdump -d testdiff:
- - - snip - - -
080483b0 <diff_umask>:
80483b0: 8b 44 24 0c mov 0xc(%esp),%eax
80483b4: 8b 4c 24 04 mov 0x4(%esp),%ecx
80483b8: 8b 10 mov (%eax),%edx
80483ba: 8d 04 11 lea (%ecx,%edx,1),%eax
80483bd: 8b 54 24 08 mov 0x8(%esp),%edx
80483c1: 2b 02 sub (%edx),%eax
80483c3: 21 c8 and %ecx,%eax
80483c5: c3 ret
- - - snip - - -
Mike
> Floating-point numbers is a whole other game.
>
> -hpa
>
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-25 21:10 ` Michael S. Zick
@ 2009-05-25 21:17 ` H. Peter Anvin
2009-05-25 23:03 ` Michael S. Zick
0 siblings, 1 reply; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-25 21:17 UTC (permalink / raw)
To: lkml; +Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Michael S. Zick wrote:
> On Mon May 25 2009, Michael S. Zick wrote:
>
> In actual application, this *should not* make a difference.
>
No kidding. This is a valid transformation for integers, since it is
all done with 2's-complement arithmetic.
Floating-point numbers is a whole other game.
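
A quick sketch of why the reordering is harmless, written with unsigned 32-bit types so the wraparound is defined behaviour in C. The function names are hypothetical; the first follows the order in the source, the second follows the order of the lea/sub sequence.

```c
#include <stdint.h>

/* Order as written in the source: subtract first, then add. */
static uint32_t sub_then_add(uint32_t mask, uint32_t cnt1, uint32_t cnt2)
{
	return ((mask - cnt1) + cnt2) & mask;
}

/* Order of the generated code: add first (the lea), then subtract. */
static uint32_t add_then_sub(uint32_t mask, uint32_t cnt1, uint32_t cnt2)
{
	return ((mask + cnt2) - cnt1) & mask;
}
```

Both are evaluated modulo 2^32, where addition and subtraction can be regrouped freely, so the two functions agree for every input. With mask 0x7fffffff, cnt1 0x7ffffffd and cnt2 0x7fffffff both return 1, the same value the test program in this thread prints.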
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-25 19:46 ` Michael S. Zick
@ 2009-05-25 21:10 ` Michael S. Zick
2009-05-25 21:17 ` H. Peter Anvin
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-25 21:10 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Mon May 25 2009, Michael S. Zick wrote:
In actual application, this *should not* make a difference.
Mike
> On Mon May 25 2009, Michael S. Zick wrote:
> > On Mon May 25 2009, Michael S. Zick wrote:
> > > On Sun May 24 2009, H. Peter Anvin wrote:
> > > > Michael S. Zick wrote:
> > > > >
> > > > > > Note: I seem to recall that newer gcc's optimizer presumes
> > > > > that the flags register is preserved across asm -
> > > > > It didn't use to do that - but there is now a "cc" to deal with
> > > > > that - Have not yet audited for that, but it is high on my list.
> > > > >
> > > >
> > > > I am pretty sure that's false... if it was true we'd have failures all
> > > > over the kernel.
> > > >
> > >
> > > No information on the above (yet) - but you gotta love this one: ;)
> > >
> > > Programmer authors code specifying that the subtraction be done
> > > prior to the addition to avoid over-flow conditions;
> > >
> > > GCC's optimizer, in its great wisdom, codes in the overflow case:
> > > ( the case of finding the characters used/free in a ring buffer )
> > >
> > > extern int diff_umask(int mask, int *cnt1, int *cnt2)
> > > { return (((mask - *cnt1) + *cnt2) & mask); }
> > >
> > > /**
> > > * gcc -O2 -S -fomit-frame-pointer difftest.c
> > > *
> > > .file "difftest.c"
> > > .text
> > > .p2align 4,,15
> > > .globl diff_umask
> > > .type diff_umask, @function
> > > diff_umask:
> > > movl 12(%esp), %eax
> > > movl 4(%esp), %ecx
> > > movl (%eax), %edx
> > > leal (%ecx,%edx), %eax
> > > movl 8(%esp), %edx
> > > subl (%edx), %eax
> > > andl %ecx, %eax
> > > ret
> > > .size diff_umask, .-diff_umask
> > > .ident "GCC: (Debian 4.3.2-1.1) 4.3.2"
> > > .section .note.GNU-stack,"",@progbits
> > > */
> > >
> > > Note: That is not the compiler version I am building my kernels with.
> > >
> >
> > The compiler I am using (Gentoo 4.1.2) gets it correct:
> >
> > .file "difftest.c"
> > .text
> > .p2align 4,,15
> > .globl diff_umask
> > .type diff_umask, @function
> > diff_umask:
> > movl 4(%esp), %eax
> > movl 8(%esp), %edx
> > movl %eax, %ecx
> > subl (%edx), %ecx
> > movl %ecx, %edx
> > movl 12(%esp), %ecx
> > addl (%ecx), %edx
> > andl %edx, %eax
> > ret
> > .size diff_umask, .-diff_umask
> > .ident "GCC: (GNU) 4.1.2 (Gentoo 4.1.2 p1.1)"
> > .section .note.GNU-stack,"",@progbits
> >
>
> Gentoo's current 4.3 gets it wrong also:
>
> .file "difftest.c"
> .text
> .p2align 4,,15
> .globl diff_umask
> .type diff_umask, @function
> diff_umask:
> movl 12(%esp), %eax
> movl 4(%esp), %ecx
> movl (%eax), %edx
> leal (%ecx,%edx), %eax
> movl 8(%esp), %edx
> subl (%edx), %eax
> andl %ecx, %eax
> ret
> .size diff_umask, .-diff_umask
> .ident "GCC: (Gentoo 4.3.2-r3 p1.6, pie-10.1.5) 4.3.2"
> .section .note.GNU-stack,"",@progbits
>
> = = = =
>
> Might be time to put compiler version checking back into the
> build system and/or re-test the sources that do have version
> checking in them (hint: the boss's code).
>
> Mike
> > Mike
> > > Don't blame me, I didn't write the compiler. ;)
> > >
> > > Mike
> > > > -hpa
> > > >
> > >
> > >
> > > --
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-25 19:18 ` Michael S. Zick
@ 2009-05-25 19:46 ` Michael S. Zick
2009-05-25 21:10 ` Michael S. Zick
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-25 19:46 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Mon May 25 2009, Michael S. Zick wrote:
> On Mon May 25 2009, Michael S. Zick wrote:
> > On Sun May 24 2009, H. Peter Anvin wrote:
> > > Michael S. Zick wrote:
> > > >
> > > > > Note: I seem to recall that newer gcc's optimizer presumes
> > > > that the flags register is preserved across asm -
> > > > It didn't use to do that - but there is now a "cc" to deal with
> > > > that - Have not yet audited for that, but it is high on my list.
> > > >
> > >
> > > I am pretty sure that's false... if it was true we'd have failures all
> > > over the kernel.
> > >
> >
> > No information on the above (yet) - but you gotta love this one: ;)
> >
> > Programmer authors code specifying that the subtraction be done
> > prior to the addition to avoid over-flow conditions;
> >
> > GCC's optimizer, in its great wisdom, codes in the overflow case:
> > ( the case of finding the characters used/free in a ring buffer )
> >
> > extern int diff_umask(int mask, int *cnt1, int *cnt2)
> > { return (((mask - *cnt1) + *cnt2) & mask); }
> >
> > /**
> > * gcc -O2 -S -fomit-frame-pointer difftest.c
> > *
> > .file "difftest.c"
> > .text
> > .p2align 4,,15
> > .globl diff_umask
> > .type diff_umask, @function
> > diff_umask:
> > movl 12(%esp), %eax
> > movl 4(%esp), %ecx
> > movl (%eax), %edx
> > leal (%ecx,%edx), %eax
> > movl 8(%esp), %edx
> > subl (%edx), %eax
> > andl %ecx, %eax
> > ret
> > .size diff_umask, .-diff_umask
> > .ident "GCC: (Debian 4.3.2-1.1) 4.3.2"
> > .section .note.GNU-stack,"",@progbits
> > */
> >
> > Note: That is not the compiler version I am building my kernels with.
> >
>
> The compiler I am using (Gentoo 4.1.2) gets it correct:
>
> .file "difftest.c"
> .text
> .p2align 4,,15
> .globl diff_umask
> .type diff_umask, @function
> diff_umask:
> movl 4(%esp), %eax
> movl 8(%esp), %edx
> movl %eax, %ecx
> subl (%edx), %ecx
> movl %ecx, %edx
> movl 12(%esp), %ecx
> addl (%ecx), %edx
> andl %edx, %eax
> ret
> .size diff_umask, .-diff_umask
> .ident "GCC: (GNU) 4.1.2 (Gentoo 4.1.2 p1.1)"
> .section .note.GNU-stack,"",@progbits
>
Gentoo's current 4.3 gets it wrong also:
.file "difftest.c"
.text
.p2align 4,,15
.globl diff_umask
.type diff_umask, @function
diff_umask:
movl 12(%esp), %eax
movl 4(%esp), %ecx
movl (%eax), %edx
leal (%ecx,%edx), %eax
movl 8(%esp), %edx
subl (%edx), %eax
andl %ecx, %eax
ret
.size diff_umask, .-diff_umask
.ident "GCC: (Gentoo 4.3.2-r3 p1.6, pie-10.1.5) 4.3.2"
.section .note.GNU-stack,"",@progbits
= = = =
Might be time to put compiler version checking back into the
build system and/or re-test the sources that do have version
checking in them (hint: the boss's code).
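A rough sketch of what such a version check could look like, using GCC's predefined __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ macros. VERSION_CODE, THIS_GCC and needs_retest are hypothetical names, and the 4.3.x range is only the series mentioned in this thread, not a confirmed list of bad compilers:

```c
/* Pack a major.minor.patch triple into one comparable integer. */
#define VERSION_CODE(major, minor, patch) \
    ((major) * 10000 + (minor) * 100 + (patch))

#if defined(__GNUC__)
#define THIS_GCC VERSION_CODE(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__)
#else
#define THIS_GCC 0   /* not GCC: nothing to compare against */
#endif

/* Hypothetical policy: flag the 4.3.x series for re-testing. */
static inline int needs_retest(long version)
{
    return version >= VERSION_CODE(4, 3, 0) &&
           version <  VERSION_CODE(4, 4, 0);
}
```

In a build system the same comparison would more likely sit in an #if on THIS_GCC that emits a #warning, so the check costs nothing at run time.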
Mike
> Mike
> > Don't blame me, I didn't write the compiler. ;)
> >
> > Mike
> > > -hpa
> > >
> >
> >
> > --
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-25 19:03 ` Michael S. Zick
@ 2009-05-25 19:18 ` Michael S. Zick
2009-05-25 19:46 ` Michael S. Zick
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-25 19:18 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Mon May 25 2009, Michael S. Zick wrote:
> On Sun May 24 2009, H. Peter Anvin wrote:
> > Michael S. Zick wrote:
> > >
> > > > Note: I seem to recall that newer gcc's optimizer presumes
> > > that the flags register is preserved across asm -
> > > It didn't use to do that - but there is now a "cc" to deal with
> > > that - Have not yet audited for that, but it is high on my list.
> > >
> >
> > I am pretty sure that's false... if it was true we'd have failures all
> > over the kernel.
> >
>
> No information on the above (yet) - but you gotta love this one: ;)
>
> Programmer authors code specifying that the subtraction be done
> prior to the addition to avoid over-flow conditions;
>
> GCC's optimizer, in its great wisdom, codes in the overflow case:
> ( the case of finding the characters used/free in a ring buffer )
>
> extern int diff_umask(int mask, int *cnt1, int *cnt2)
> { return (((mask - *cnt1) + *cnt2) & mask); }
>
> /**
> * gcc -O2 -S -fomit-frame-pointer difftest.c
> *
> .file "difftest.c"
> .text
> .p2align 4,,15
> .globl diff_umask
> .type diff_umask, @function
> diff_umask:
> movl 12(%esp), %eax
> movl 4(%esp), %ecx
> movl (%eax), %edx
> leal (%ecx,%edx), %eax
> movl 8(%esp), %edx
> subl (%edx), %eax
> andl %ecx, %eax
> ret
> .size diff_umask, .-diff_umask
> .ident "GCC: (Debian 4.3.2-1.1) 4.3.2"
> .section .note.GNU-stack,"",@progbits
> */
>
> Note: That is not the compiler version I am building my kernels with.
>
The compiler I am using (Gentoo 4.1.2) gets it correct:
.file "difftest.c"
.text
.p2align 4,,15
.globl diff_umask
.type diff_umask, @function
diff_umask:
movl 4(%esp), %eax
movl 8(%esp), %edx
movl %eax, %ecx
subl (%edx), %ecx
movl %ecx, %edx
movl 12(%esp), %ecx
addl (%ecx), %edx
andl %edx, %eax
ret
.size diff_umask, .-diff_umask
.ident "GCC: (GNU) 4.1.2 (Gentoo 4.1.2 p1.1)"
.section .note.GNU-stack,"",@progbits
Mike
> Don't blame me, I didn't write the compiler. ;)
>
> Mike
> > -hpa
> >
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 18:46 ` H. Peter Anvin
2009-05-24 19:09 ` Michael S. Zick
@ 2009-05-25 19:03 ` Michael S. Zick
2009-05-25 19:18 ` Michael S. Zick
1 sibling, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-25 19:03 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun May 24 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> >
> > > Note: I seem to recall that newer gcc's optimizer presumes
> > that the flags register is preserved across asm -
> > It didn't use to do that - but there is now a "cc" to deal with
> > that - Have not yet audited for that, but it is high on my list.
> >
>
> I am pretty sure that's false... if it was true we'd have failures all
> over the kernel.
>
No information on the above (yet) - but you gotta love this one: ;)
Programmer authors code specifying that the subtraction be done
prior to the addition to avoid over-flow conditions;
GCC's optimizer, in its great wisdom, codes in the overflow case:
( the case of finding the characters used/free in a ring buffer )
extern int diff_umask(int mask, int *cnt1, int *cnt2)
{ return (((mask - *cnt1) + *cnt2) & mask); }
/**
* gcc -O2 -S -fomit-frame-pointer difftest.c
*
.file "difftest.c"
.text
.p2align 4,,15
.globl diff_umask
.type diff_umask, @function
diff_umask:
movl 12(%esp), %eax
movl 4(%esp), %ecx
movl (%eax), %edx
leal (%ecx,%edx), %eax
movl 8(%esp), %edx
subl (%edx), %eax
andl %ecx, %eax
ret
.size diff_umask, .-diff_umask
.ident "GCC: (Debian 4.3.2-1.1) 4.3.2"
.section .note.GNU-stack,"",@progbits
*/
Note: That is not the compiler version I am building my kernels with.
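For context, a sketch of the ring-buffer bookkeeping that an expression like diff_umask() is typically used for. The layout is an assumption: a power-of-two buffer with mask = size - 1, head counting bytes written and tail counting bytes read. ring_used and ring_free are illustrative names only:

```c
#include <stdint.h>

/* Bytes currently stored: distance from tail to head, modulo size. */
static inline uint32_t ring_used(uint32_t mask, uint32_t head, uint32_t tail)
{
    return (head - tail) & mask;
}

/* Same shape as diff_umask(): (mask - head + tail) & mask.
 * One slot is kept empty so "full" and "empty" stay distinguishable. */
static inline uint32_t ring_free(uint32_t mask, uint32_t head, uint32_t tail)
{
    return ((mask - head) + tail) & mask;
}
```

With an 8-entry buffer (mask 7), head 5 and tail 2 give 3 used and 4 free. After head wraps (head 1, tail 6) the counts are still 3 and 4, because the mask discards the borrow.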
Don't blame me, I didn't write the compiler. ;)
Mike
> -hpa
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 18:32 ` Michael S. Zick
2009-05-24 18:46 ` H. Peter Anvin
@ 2009-05-25 16:05 ` Michael S. Zick
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-25 16:05 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun May 24 2009, Michael S. Zick wrote:
> On Sun May 24 2009, H. Peter Anvin wrote:
> > Michael S. Zick wrote:
> > >
> > > @hpa - I still like your suggestion that it is only one (or a few)
> > > uses of atomic ops that is incorrect and in general atomic ops
> > > should compile away on uni-processor.
> > >
> >
> > Actually, the more I think about it the more I suspect there is a race
> > condition either in the chip set or in any VIA-specific drivers (if
> > there are any.) Putting LOCKs in random places will slow the CPU down
> > significantly, so it might resolve the race condition without actually
> > solving the problem.
> >
>
> They are mostly out of the -09143 and -09144 builds -
> No cpufreq (I.E: no e_powersaver).
> The padlock-* drivers are modules which must be manually loaded.
>
> The i2c-viapro driver (in spite of its comments) does not work
> on CX700 (written before manual was released) - it is reading
> the serial number rather than the second data port. ;)
> (No access to the chipset temperature/voltage data on SMBus).
>
> The via-fb driver just "doesn't work" - Haven't looked at it yet.
>
> There is a VIA-specific driver for the VIA USB controller, but it
> isn't in the x86 part of the tree - Haven't looked at it yet.
>
> There isn't a driver for the hardware watchdog on CX700 -
> There isn't a driver for the machine error reporting -
>
> = = = =
>
> Although there may be timing requirement differences on the
> CX700 and CN896 - I think more likely a human error (typo)
> in the "clobber" lines of the asm - Have not yet audited that,
> but it is high on my list.
>
> Note: I seem to recall that newer gcc's optimizer presumes
> that the flags register is preserved across asm -
> It didn't use to do that - but there is now a "cc" to deal with
> that - Have not yet audited for that, but it is high on my list.
>
> Busy, busy, busy - -
> The -09144lk on C7-M/CX700 now up for 3 3/4 hours close to a new
> record - but ehci-hcd has not yet gone into a re-try loop.
>
The -09145{,lk}-db pair is posted now.
Same code-base/config as the -09144{,lk} pair with the addition
of lockdep checking.
Details:
http://forum.netbookuser.com/viewtopic.php?pid=6980#p6980
Mike
> Mike
> > -hpa
> >
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 18:46 ` H. Peter Anvin
@ 2009-05-24 19:09 ` Michael S. Zick
2009-05-25 19:03 ` Michael S. Zick
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-24 19:09 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun May 24 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> >
> > Note: I seem to recall that newer gcc's optimizer presumes
> > that the flags register is preserved across asm -
> > It didn't use to do that - but there is now a "cc" to deal with
> > that - Have not yet audited for that, but it is high on my list.
> >
>
> I am pretty sure that's false... if it was true we'd have failures all
> over the kernel.
>
Not an issue at the moment, will cover that when I audit my own code.
Mike
> -hpa
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 20:32 ` Michael S. Zick
` (2 preceding siblings ...)
2009-05-22 20:45 ` Roland Dreier
@ 2009-05-24 18:59 ` Robert Hancock
3 siblings, 0 replies; 65+ messages in thread
From: Robert Hancock @ 2009-05-24 18:59 UTC (permalink / raw)
To: lkml; +Cc: Samuel Thibault, Andi Kleen, linux-kernel
Michael S. Zick wrote:
> On Fri May 22 2009, Samuel Thibault wrote:
>> Michael S. Zick, le Fri 22 May 2009 14:53:39 -0500, a écrit :
>>> Ref: http://developer.intel.com/Assets/PDF/manual/253666.pdf
>>> Manual page: 3-590 PDF page: 638
>>> Summary: Processors prior to P-4 can take an interrupt between
>>> the read cycle and the write cycle. Which is why opcode 0xF0 exists.
>> Where do you see page 638/639 talking about interrupts? It talks about
>> multi-processor machines.
>>
>
> No - it talks about "exclusive memory access" - You got bus master DMA
> in your test machine? You also have an older than P-4 single processor?
It means that LOCK is required in multi-processor environment to ensure
that an instruction executes atomically WRT memory operations being done
on other CPUs. On a single processor, except for some weird exceptions
(like rep instructions, which can't be LOCKed anyways), instructions are
always atomic with respect to interrupts.
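To make that concrete, here is a simplified sketch of how the prefix is selected at build time. It is not the kernel's actual definition: as far as I know the real LOCK_PREFIX also records each prefix in a .smp_locks section so it can be patched at run time. The non-x86 branch is only a placeholder so the sketch builds elsewhere:

```c
/* Simplified stand-in for the kernel's LOCK_PREFIX selection. */
#ifdef CONFIG_SMP
#define LOCK_PREFIX "lock; "
#else
#define LOCK_PREFIX ""
#endif

typedef struct { int counter; } atomic_t;

static inline void atomic_inc(atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
    /* A single read-modify-write instruction. Interrupts are taken only
     * on instruction boundaries, so UP builds omit the lock prefix. */
    __asm__ __volatile__(LOCK_PREFIX "incl %0" : "+m" (v->counter));
#else
    v->counter++;   /* placeholder for non-x86 targets */
#endif
}
```

Built without CONFIG_SMP this emits a plain incl, which is the case being argued about in this thread. The prefix matters only when another CPU or bus agent can touch the same memory between the read and the write.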
>
> Look people, I just reported what I found from testing -
> Please don't shoot the messenger.
>
> If it: "Does not make a difference" then it "Should not make a difference"
> but it does, try it yourself. It's safe (if LOCK_PREFIX is in the proper
> places) - the machine will ignore the opcode if it is recent enough to not
> need it - just trust the cpu's micro-code.
What do you mean "recent enough to not need it?" There is no such thing.
On any x86 machine it does something. It will slow things down, and
there is no reason it should be required on uni-processor systems.
Quite likely that's the only effect adding the LOCK prefix is having,
slowing things down, and covering up whatever is causing your issue,
without having anything to do with the root cause.
>
> Mike
>> Samuel
>>
>>
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 18:32 ` Michael S. Zick
@ 2009-05-24 18:46 ` H. Peter Anvin
2009-05-24 19:09 ` Michael S. Zick
2009-05-25 19:03 ` Michael S. Zick
2009-05-25 16:05 ` Michael S. Zick
1 sibling, 2 replies; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-24 18:46 UTC (permalink / raw)
To: lkml; +Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Michael S. Zick wrote:
>
> Note: I seem to recall that newer gcc's optimizer presumes
> that the flags register is preserved across asm -
> It didn't use to do that - but there is now a "cc" to deal with
> that - Have not yet audited for that, but it is high on my list.
>
I am pretty sure that's false... if it was true we'd have failures all
over the kernel.
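For reference, a small sketch of what an explicit "cc" clobber looks like. As far as I know GCC on x86 treats the flags as clobbered by every asm statement whether or not "cc" is listed, so the clobber here is documentation rather than a fix. add_flags_clobbered is an illustrative name, and the non-x86 branch is a placeholder:

```c
static inline int add_flags_clobbered(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    /* addl rewrites EFLAGS; the "cc" clobber states that explicitly. */
    __asm__("addl %1, %0" : "+r" (a) : "r" (b) : "cc");
#else
    a += b;   /* placeholder for non-x86 targets */
#endif
    return a;
}
```

If the compiler really assumed flags survived an asm like this one, code that tests a condition around it would misbehave, which is the kind of widespread failure referred to above.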
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 18:00 ` H. Peter Anvin
@ 2009-05-24 18:32 ` Michael S. Zick
2009-05-24 18:46 ` H. Peter Anvin
2009-05-25 16:05 ` Michael S. Zick
2009-05-28 20:30 ` Pavel Machek
1 sibling, 2 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-24 18:32 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun May 24 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> >
> > @hpa - I still like your suggestion that it is only one (or a few)
> > uses of atomic ops that is incorrect and in general atomic ops
> > should compile away on uni-processor.
> >
>
> Actually, the more I think about it the more I suspect there is a race
> condition either in the chip set or in any VIA-specific drivers (if
> there are any.) Putting LOCKs in random places will slow the CPU down
> significantly, so it might resolve the race condition without actually
> solving the problem.
>
They are mostly out of the -09143 and -09144 builds -
No cpufreq (I.E: no e_powersaver).
The padlock-* drivers are modules which must be manually loaded.
The i2c-viapro driver (in spite of its comments) does not work
on CX700 (written before manual was released) - it is reading
the serial number rather than the second data port. ;)
(No access to the chipset temperature/voltage data on SMBus).
The via-fb driver just "doesn't work" - Haven't looked at it yet.
There is a VIA-specific driver for the VIA USB controller, but it
isn't in the x86 part of the tree - Haven't looked at it yet.
There isn't a driver for the hardware watchdog on CX700 -
There isn't a driver for the machine error reporting -
= = = =
Although there may be timing requirement differences on the
CX700 and CN896 - I think more likely a human error (typo)
in the "clobber" lines of the asm - Have not yet audited that,
but it is high on my list.
Note: I seem to recall that newer gcc's optimizer presumes
that the flags register is preserved across asm -
It didn't use to do that - but there is now a "cc" to deal with
that - Have not yet audited for that, but it is high on my list.
Busy, busy, busy - -
The -09144lk on C7-M/CX700 now up for 3 3/4 hours close to a new
record - but ehci-hcd has not yet gone into a re-try loop.
Mike
> -hpa
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 12:27 ` Michael S. Zick
2009-05-24 17:22 ` Harald Welte
@ 2009-05-24 18:00 ` H. Peter Anvin
2009-05-24 18:32 ` Michael S. Zick
2009-05-28 20:30 ` Pavel Machek
1 sibling, 2 replies; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-24 18:00 UTC (permalink / raw)
To: lkml; +Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Michael S. Zick wrote:
>
> @hpa - I still like your suggestion that it is only one (or a few)
> uses of atomic ops that is incorrect and in general atomic ops
> should compile away on uni-processor.
>
Actually, the more I think about it the more I suspect there is a race
condition either in the chip set or in any VIA-specific drivers (if
there are any.) Putting LOCKs in random places will slow the CPU down
significantly, so it might resolve the race condition without actually
solving the problem.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 12:38 ` Michael S. Zick
@ 2009-05-24 17:31 ` Harald Welte
0 siblings, 0 replies; 65+ messages in thread
From: Harald Welte @ 2009-05-24 17:31 UTC (permalink / raw)
To: Michael S. Zick
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Hi Michael,
On Sun, May 24, 2009 at 07:38:44AM -0500, Michael S. Zick wrote:
> > As far as I know, there really is no such documentation.. all documentation
> > that I've ever seen internally is electrical data sheets and high-level feature
> > set descriptions, CPUID, MSR and padlock. There are no actual x86 instruction
> > set documents... Centaur is < 100 people, they don't have the resources to work
> > on documents along the lines of what Intel has...
>
> My background is in the electronic hardware end of things - -
> Is there someone I can contact for the existing documents -
> Even under NDA would be fine.
I have inquired right now. The regular NDA process I would assume is probably
quite slow. The CPU documentation is already on its track for becoming public
at some point (but very slooooow track), so I'll see what I can do and contact
you in private mail.
> For instance, the layout of the CPUID results - they don't
> currently seem to match what the marketing people claim is
> inside of the chips. There are some "VIA specific" fields.
There's two versions of the C7-M, an 'A' model (90nm SOI) and a much more
recent 'D' model (90nm conventional process). Their CPUID values are 6-a and
6-d, respectively. The cpu ID string of the former ones contains Esther,
the latter one contains C7-M - but in fact any BIOS could override the cpu
ID string (not cpuid!) with whatever they want using a backdoor in some MSR.
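A sketch of how the 6-a and 6-d values fall out of the CPUID signature, assuming the standard x86 layout of EAX from CPUID leaf 1 and ignoring the extended family/model fields. struct cpu_sig and decode_sig are illustrative names:

```c
#include <stdint.h>

struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Base fields of CPUID leaf 1 EAX:
 * bits 3:0 stepping, bits 7:4 model, bits 11:8 family. */
static inline struct cpu_sig decode_sig(uint32_t eax)
{
    struct cpu_sig s;

    s.stepping = eax & 0xf;
    s.model    = (eax >> 4) & 0xf;
    s.family   = (eax >> 8) & 0xf;
    return s;
}
```

A signature of 0x6A9 would decode as family 6, model 0xa, and 0x6D0 as family 6, model 0xd. The stepping digits in those two values are made up for the example.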
> Could you also dig around for a tech manual on CN896 similar to
> the one (of two) CX700 manuals that are publicly posted?
I've asked about that. The programming guides for chipsets are generally on
the 'open track', whereas the electrical data sheets with pinouts and timing
values are under NDA.
The CN896 was just already an "old" component when that new open-track policy
was introduced, and typically VIA is trying to focus on docs and drivers for
new products, rather than old ones. But I have asked if we can release the
CN896 programming manual public.
> Even under NDA is fine.
Well, I prefer to make sure that we have the necessary information open.
NDAs are fine and well for the limited number of customers you have, but
making NDAs with various individual programmers really is too painful,
there should be other ways...
--
- Harald Welte <HaraldWelte@viatech.com> http://linux.via.com.tw/
============================================================================
VIA Open Source Liaison
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 12:27 ` Michael S. Zick
@ 2009-05-24 17:22 ` Harald Welte
2009-05-24 18:00 ` H. Peter Anvin
1 sibling, 0 replies; 65+ messages in thread
From: Harald Welte @ 2009-05-24 17:22 UTC (permalink / raw)
To: Michael S. Zick
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun, May 24, 2009 at 07:27:27AM -0500, Michael S. Zick wrote:
> *) The difference on the C7-M/CX700 between the -09143 and -09143lk
> I consider significant.
I agree.
> ***) But, keep in mind, just because the system chip set is different,
> there are other unknowns - -
> We can *not* say at the moment that both machines were using the same
> execution paths - even though the binaries were identical.
yes, of course.
> Also, there were probably different external modules loaded in the
> two runs - not many, mostly things are built-in.
>
> The truly significant point on the C7-M/CX700 running -09143lk was that
> when the ehci-hcd driver got hung in its failure loop, generating a
> flood of messages - it did not take down or lock the kernel.
>
> I consider this "forward progress" - it should be possible to build-in
> the lock-dep checkers and get something in the message buffer -
> rather than just have the machine halt. It's hard to debug a halted
> machine with only a glowing power-on light for feed-back. ;)
well, if you're not working with notebooks but actual regular mainboard
devices, then you should have a serial console and possibly still have
magic sysrq or at least some other interesting information on the console.
I personally don't have access to a CX700 based board at the moment, and due to
my travel schedule I won't get that before June 6th. However, I do have access
to C7-M boards with VX800 and VX855. However, they don't use the VIA Rhine
Ethernet chip, so if you are triggering the bug with that driver, it is
unlikely to occur there.
Meanwhile, I will inquire what the CPU guys think should happen with regard
to the LOCK prefix. If their view of the world of what they expect from the
hardware is already different from our assumptions, we can save ourselves
time consuming testing...
Regards,
--
- Harald Welte <HaraldWelte@viatech.com> http://linux.via.com.tw/
============================================================================
VIA Open Source Liaison
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 7:04 ` Harald Welte
2009-05-24 12:48 ` Michael S. Zick
@ 2009-05-24 15:43 ` Michael S. Zick
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-24 15:43 UTC (permalink / raw)
To: Harald Welte
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun May 24 2009, Harald Welte wrote:
> On Sat, May 23, 2009 at 05:44:52AM -0500, Michael S. Zick wrote:
>
> > *) The HP-2133 MiniNote uses the cn896 chipset, which
> > has not yet been released from NDA.
>
> I can see towards getting that changed, but I doubt this helps us with the
> current problem.
>
> > I do not have the C7-M technical reference, it is still
> > under NDA.
>
> I obviously have access to that documentation (which is also on its way
> to become public, but needs more time) - but believe me, there is nothing
> in that documentation that would help you to debug this problem :(
>
> > *But* if a developer on this list has a copy of the
> > manual *and* owns one of these three brands of machine -
> > they would have fixed their own machine a year ago.
>
> I actually own a 2133 mininote, but I rarely used it for anything but to test
> openchrome on it. What do you suggest me to try?
>
The {,lk} pair of yesterday - now built against tag 2.6.30-rc7 is
posted as -09144{,lk}
Details here:
http://forum.netbookuser.com/viewtopic.php?pid=6976#p6976
Try them on a C7-M/CX700 (or newer NetBook system chipset)
(I don't normally test on the HP-2133 (C7-M/CN896) since I am
not (yet) dealing with the Broadcom firmware and SBB driver.)
Mike
> I also have some other systems with a C7-M, so I can certainly verify
> certain code on a number of them, if a good testcase exists.
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 7:04 ` Harald Welte
@ 2009-05-24 12:48 ` Michael S. Zick
2009-05-24 15:43 ` Michael S. Zick
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-24 12:48 UTC (permalink / raw)
To: Harald Welte; +Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
On Sun May 24 2009, Harald Welte wrote:
> On Sat, May 23, 2009 at 05:44:52AM -0500, Michael S. Zick wrote:
>
> > *) The HP-2133 MiniNote uses the cn896 chipset, which
> > has not yet been released from NDA.
>
> I can see towards getting that changed, but I doubt this helps us with the
> current problem.
>
> > I do not have the C7-M technical reference, it is still
> > under NDA.
>
> I obviously have access to that documentation (which is also on its way
> to become public, but needs more time) - but believe me, there is nothing
> in that documentation that would help you to debug this problem :(
>
> > *But* if a developer on this list has a copy of the
> > manual *and* owns one of these three brands of machine -
> > they would have fixed their own machine a year ago.
>
> I actually own a 2133 mininote, but I rarely used it for anything but to test
> openchrome on it. What do you suggest me to try?
>
The HP-2133 (C7-M/CN896) did not fail yesterday.
Find a C7-M/CX700 machine.
You might hook the rss feed at:
http://forum.netbookuser.com/viewforum.php?id=8
where my rants/raves/speculations are logged and
the other people helping me test make their comments.
In particular:
The original instructions (including download url):
http://forum.netbookuser.com/viewtopic.php?pid=6702#p6702
Updated installation instructions:
http://forum.netbookuser.com/viewtopic.php?id=907
Also ignore anything you read in LKML that I have been doing
this in secret - those authors just never got the memo. ;)
> I also have some other systems with a C7-M, so I can certainly verify
> certain code on a number of them, if a good testcase exists.
>
Still working towards a specific test case - only thing at this
point is the sledgehammer of putting the "lock" back in, everywhere.
Mike
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-24 6:49 ` Harald Welte
@ 2009-05-24 12:38 ` Michael S. Zick
2009-05-24 17:31 ` Harald Welte
2009-05-30 15:48 ` Michael S. Zick
1 sibling, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-24 12:38 UTC (permalink / raw)
To: Harald Welte
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sun May 24 2009, Harald Welte wrote:
> Dear hpa, and others,
>
> On Sat, May 23, 2009 at 04:44:08PM -0700, H. Peter Anvin wrote:
> > It looks like there might be a problem with the C7-M ... Michael reports
> > that if he sets LOCK_PREFIX to "lock;" it works, but that shouldn't be
> > necessary for a uniprocessor.
> >
>
> I will try my best to help, though I have to admit I'm far from being
> a x86 expert, and particularly not with regard to low-level bits such as atomic
> operations.
>
> So please give me some time to research some background about that,
> and read up all the details on the currently reported/described problem.
>
> Once I understand it in full detail, I can talk to the right people inside
> CentaurLabs (VIA's CPU division).
>
> If somebody (optionally) can phrase a precise technical question that I can
> directly forward to somebody with low-level x86 knowledge but no Linux background,
> it would definitely help speeding up the process.
>
> > I'm wondering if we have to revive the OOSTORE hack, or some other
> > workaround. It is of course hard for me to track this down since (a) I
> > don't have access to the CPU documentation,
>
> As far as I know, there really is no such documentation.. all documentation
> that I've ever seen internally is electrical data sheets and high-level feature
> set descriptions, CPUID, MSR and padlock. There are no actual x86 instruction
> set documents... Centaur is < 100 people, they don't have the resources to work
> on documents along the lines of what Intel has...
>
My background is in the electronic hardware end of things - -
Is there someone I can contact for the existing documents -
Even under NDA would be fine.
For instance, the layout of the CPUID results - they don't
currently seem to match what the marketing people claim is
inside of the chips. There are some "VIA specific" fields.
Also, those funny looking electrical data sheets with the wiggly
lines will mean something to me in terms of when to use the
"lock" prefix. All you have to do is grow up with such things. ;)
Could you also dig around for a tech manual on CN896 similar to
the one (of two) CX700 manuals that are publicly posted?
Even under NDA is fine.
> > and (b) I work for Intel now, which limits the amount of time I can
> > realistically spend on this.
>
I might be able to get you a machine, but if you are scanned
at the front door for VIA or AMD hardware. . . ;)
Mike
> Sure, thanks for letting me know.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 23:44 ` H. Peter Anvin
2009-05-24 6:49 ` Harald Welte
@ 2009-05-24 12:27 ` Michael S. Zick
2009-05-24 17:22 ` Harald Welte
2009-05-24 18:00 ` H. Peter Anvin
1 sibling, 2 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-24 12:27 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Harald Welte, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
On Sat May 23 2009, H. Peter Anvin wrote:
> Hi Harald,
>
> It looks like there might be a problem with the C7-M ... Michael reports
> that if he sets LOCK_PREFIX to "lock;" it works, but that shouldn't be
> necessary for a uniprocessor.
>
> I'm wondering if we have to revive the OOSTORE hack, or some other
> workaround. It is of course hard for me to track this down since (a) I
> don't have access to the CPU documentation, and (b) I work for Intel
> now, which limits the amount of time I can realistically spend on this.
>
@hpa - I still like your suggestion that it is only one (or a few)
uses of atomic ops that is incorrect and in general atomic ops
should compile away on uni-processor.
Let me translate the findings (see further in the included post) -
The C7-M/CN896 (no tech manual released for CN896 yet) and
the C7-M/CX700 (tech manual released since drivers written)
*) I never tested -09143lk on the C7-M/CN896 because -09143 did
not fail all day (a record for 2.6.30 at the moment).
*) The difference on the C7-M/CX700 between the -09143 and -09143lk
I consider significant.
***) But, keep in mind, just because the system chip set is different,
there are other unknowns - -
We can *not* say at the moment that both machines were using the same
execution paths - even though the binaries were identical.
Also, there were probably different external modules loaded in the
two runs - not many, mostly things are built-in.
The truly significant point on the C7-M/CX700 running -09143lk was that
when the ehci-hcd driver got hung in its failure loop, generating a
flood of messages - it did not take down or lock the kernel.
I consider this "forward progress" - it should be possible to build-in
the lock-dep checkers and get something in the message buffer -
rather than just have the machine halt. It's hard to debug a halted
machine with only a glowing power-on light for feed-back. ;)
Mike
> -hpa
>
> [Cc: Alan, who I believe developed the OOSTORE hack back when.]
>
>
> Michael S. Zick wrote:
> > On Fri May 22 2009, H. Peter Anvin wrote:
> >> Michael S. Zick wrote:
> >>> Same integrated motherboard.
> >> Which means same CPU, same BIOS, same motherboard (none of which you're
> >> telling us.)
> >>
> >> cpuinfo and dmidecode would be informative.
> >>
> >
> > The -09143lk files are posted.
> >
> > Download location:
> > http://hp-umpc.com/ce1200v
> >
> > Details so far today:
> > http://forum.netbookuser.com/viewtopic.php?pid=6968#p6968
> >
> > Summary:
> > HP-2133 (C7-M/CN896) - 09143 - No results - Still up 3 1/2 hours.
> > Cloudbook (C7-M/CX700) - 09143 - 45 minutes.
> > Cloudbook (C7-M/CX700) - 09143lk - No results - Still up 1 1/2 hours.
> >
> > OK - time to look for the missing "memory" in the clobber lists. ;)
> >
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 10:44 ` Michael S. Zick
2009-05-23 11:18 ` Michael S. Zick
@ 2009-05-24 7:04 ` Harald Welte
2009-05-24 12:48 ` Michael S. Zick
2009-05-24 15:43 ` Michael S. Zick
2009-05-27 22:13 ` Roland Dreier
2 siblings, 2 replies; 65+ messages in thread
From: Harald Welte @ 2009-05-24 7:04 UTC (permalink / raw)
To: Michael S. Zick
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-kernel
On Sat, May 23, 2009 at 05:44:52AM -0500, Michael S. Zick wrote:
> *) The HP-2133 MiniNote uses the cn896 chipset, which
> has not yet been released from NDA.
I can work towards getting that changed, but I doubt this helps us with the
current problem.
> I do not have the C7-M technical reference, it is still
> under NDA.
I obviously have access to that documentation (which is also on its way
to become public, but needs more time) - but believe me, there is nothing
in that documentation that would help you to debug this problem :(
> *But* if a developer on this list has a copy of the
> manual *and* owns one of these three brands of machine -
> they would have fixed their own machine a year ago.
I actually own a 2133 mininote, but I rarely used it for anything but to test
openchrome on it. What do you suggest I try?
I also have some other systems with a C7-M, so I can certainly verify
certain code on a number of them, if a good testcase exists.
--
- Harald Welte <HaraldWelte@viatech.com> http://linux.via.com.tw/
============================================================================
VIA Open Source Liaison
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 23:44 ` H. Peter Anvin
@ 2009-05-24 6:49 ` Harald Welte
2009-05-24 12:38 ` Michael S. Zick
2009-05-30 15:48 ` Michael S. Zick
2009-05-24 12:27 ` Michael S. Zick
1 sibling, 2 replies; 65+ messages in thread
From: Harald Welte @ 2009-05-24 6:49 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: lkml, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Dear hpa, and others,
On Sat, May 23, 2009 at 04:44:08PM -0700, H. Peter Anvin wrote:
> It looks like there might be a problem with the C7-M ... Michael reports
> that if he sets LOCK_PREFIX to "lock;" it works, but that shouldn't be
> necessary for a uniprocessor.
>
I will try my best to help, though I have to admit I'm far from being
an x86 expert, and particularly not with regard to low-level bits such as atomic
operations.
So please give me some time to research some background about that,
and read up all the details on the currently reported/described problem.
Once I understand it in full detail, I can talk to the right people inside
CentaurLabs (VIA's CPU division).
If somebody (optionally) can phrase a precise technical question that I can
directly forward to somebody with low-level x86 knowledge but no Linux background,
it would definitely help speeding up the process.
> I'm wondering if we have to revive the OOSTORE hack, or some other
> workaround. It is of course hard for me to track this down since (a) I
> don't have access to the CPU documentation,
As far as I know, there really is no such documentation.. all documentation
that I've ever seen internally is electrical data sheets and high-level feature
set descriptions, CPUID, MSR and padlock. There are no actual x86 instruction
set documents... Centaur is < 100 people, they don't have the resources to work
on documents along the lines of what Intel has...
> and (b) I work for Intel now, which limits the amount of time I can
> realistically spend on this.
Sure, thanks for letting me know.
--
- Harald Welte <HaraldWelte@viatech.com> http://linux.via.com.tw/
============================================================================
VIA Open Source Liaison
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 18:04 ` Michael S. Zick
@ 2009-05-23 23:44 ` H. Peter Anvin
2009-05-24 6:49 ` Harald Welte
2009-05-24 12:27 ` Michael S. Zick
0 siblings, 2 replies; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-23 23:44 UTC (permalink / raw)
To: Harald Welte; +Cc: lkml, Ingo Molnar, Thomas Gleixner, linux-kernel, Alan Cox
Hi Harald,
It looks like there might be a problem with the C7-M ... Michael reports
that if he sets LOCK_PREFIX to "lock;" it works, but that shouldn't be
necessary for a uniprocessor.
I'm wondering if we have to revive the OOSTORE hack, or some other
workaround. It is of course hard for me to track this down since (a) I
don't have access to the CPU documentation, and (b) I work for Intel
now, which limits the amount of time I can realistically spend on this.
-hpa
[Cc: Alan, who I believe developed the OOSTORE hack back when.]
Michael S. Zick wrote:
> On Fri May 22 2009, H. Peter Anvin wrote:
>> Michael S. Zick wrote:
>>> Same integrated motherboard.
>> Which means same CPU, same BIOS, same motherboard (none of which you're
>> telling us.)
>>
>> cpuinfo and dmidecode would be informative.
>>
>
> The -09143lk files are posted.
>
> Download location:
> http://hp-umpc.com/ce1200v
>
> Details so far today:
> http://forum.netbookuser.com/viewtopic.php?pid=6968#p6968
>
> Summary:
> HP-2133 (C7-M/CN896) - 09143 - No results - Still up 3 1/2 hours.
> Cloudbook (C7-M/CX700) - 09143 - 45 minutes.
> Cloudbook (C7-M/CX700) - 09143lk - No results - Still up 1 1/2 hours.
>
> OK - time to look for the missing "memory" in the clobber lists. ;)
>
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 0:51 ` H. Peter Anvin
` (2 preceding siblings ...)
2009-05-23 18:04 ` Michael S. Zick
@ 2009-05-23 20:51 ` Michael S. Zick
3 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-23 20:51 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Fri May 22 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> > Same integrated motherboard.
>
> Which means same CPU, same BIOS, same motherboard (none of which you're
> telling us.)
>
HP-2133 (C7-M/CN896) - 09143 - No results - Still up - 6 hours.
A "personal best" for 2.6.30 on VIA hardware.
Cloudbook (C7-M/CX700) - 09143 - 45 minutes.
Cloudbook (C7-M/CX700) - 09143lk - Partial results - Still up - 4 hours.
Sometime recently, the ehci (USB-2.0) driver went into its failure loop
but the kernel lived, and the music plays on (less the external mouse).
I think I will put it out of its misery now and take some time off myself.
Mike
> cpuinfo and dmidecode would be informative.
>
> -hpa
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 0:51 ` H. Peter Anvin
2009-05-23 10:44 ` Michael S. Zick
2009-05-23 15:52 ` Michael S. Zick
@ 2009-05-23 18:04 ` Michael S. Zick
2009-05-23 23:44 ` H. Peter Anvin
2009-05-23 20:51 ` Michael S. Zick
3 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-23 18:04 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Fri May 22 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> > Same integrated motherboard.
>
> Which means same CPU, same BIOS, same motherboard (none of which you're
> telling us.)
>
> cpuinfo and dmidecode would be informative.
>
The -09143lk files are posted.
Download location:
http://hp-umpc.com/ce1200v
Details so far today:
http://forum.netbookuser.com/viewtopic.php?pid=6968#p6968
Summary:
HP-2133 (C7-M/CN896) - 09143 - No results - Still up 3 1/2 hours.
Cloudbook (C7-M/CX700) - 09143 - 45 minutes.
Cloudbook (C7-M/CX700) - 09143lk - No results - Still up 1 1/2 hours.
OK - time to look for the missing "memory" in the clobber lists. ;)
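For context on that remark: a "memory" clobber tells GCC that an asm
statement may read or write memory that is not listed in its operands,
so the compiler must not keep values cached in registers across it or
move memory accesses past it. The following is only a minimal sketch of
the idea, not kernel code; demo_barrier(), demo_xchg() and
demo_check_xchg() are hypothetical names, and the non-x86 branch exists
only so the example builds elsewhere.

```c
#include <assert.h>

/* Compiler-only barrier: an empty asm whose "memory" clobber forbids
 * GCC from caching or reordering memory accesses across this point.
 * It emits no instruction at all. */
static inline void demo_barrier(void)
{
	__asm__ __volatile__("" : : : "memory");
}

/* Swap newval into *ptr and return the previous contents.
 * With a memory operand, xchg is locked implicitly on x86. */
static inline int demo_xchg(int *ptr, int newval)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("xchgl %0, %1"
			     : "+r" (newval), "+m" (*ptr)
			     :
			     : "memory");	/* the clobber in question */
	return newval;
#else
	int old = *ptr;		/* plain C fallback, not atomic */
	*ptr = newval;
	return old;
#endif
}

/* Tiny self-check of the exchange semantics. */
static inline void demo_check_xchg(void)
{
	int word = 5;

	assert(demo_xchg(&word, 7) == 5);
	demo_barrier();
	assert(word == 7);
}
```

If the clobber were left out, the asm would still be correct for the
word it names, but GCC would be free to reorder unrelated loads and
stores around it.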
Mike
> -hpa
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 0:51 ` H. Peter Anvin
2009-05-23 10:44 ` Michael S. Zick
@ 2009-05-23 15:52 ` Michael S. Zick
2009-05-23 18:04 ` Michael S. Zick
2009-05-23 20:51 ` Michael S. Zick
3 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-23 15:52 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Fri May 22 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> > Same integrated motherboard.
>
> Which means same CPU, same BIOS, same motherboard (none of which you're
> telling us.)
>
> cpuinfo and dmidecode would be informative.
>
Build:
linux-2.6.30-rc6-ce1200v-09143_2.6.30-rc6-ce1200v-09143-22_i386.deb
The -09143lk later today.
Now also testing on the HP-2133 (C7-M/CN896) in addition
to the Everex Cloudbook/Sylvania gBook (C7-M/CX700).
Additional details:
http://forum.netbookuser.com/viewtopic.php?pid=6968#p6968
Download location:
http://hp-umpc.com/ce1200v/
HP-2133 data capture and the Sylvania/Everex data capture:
hp-2133-data_cap.tgz
sylvania-g-data.tar.gz
Summary:
On the ce1200v - first test 46 minutes uptime.
On the hp-2133 - ?? still running - no results yet.
The -09143lk (yyddd) build not yet tested.
Mike
> -hpa
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 10:44 ` Michael S. Zick
@ 2009-05-23 11:18 ` Michael S. Zick
2009-05-24 7:04 ` Harald Welte
2009-05-27 22:13 ` Roland Dreier
2 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-23 11:18 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Sat May 23 2009, Michael S. Zick wrote:
> On Fri May 22 2009, H. Peter Anvin wrote:
> > Michael S. Zick wrote:
> > > Same integrated motherboard.
> >
> > Which means same CPU, same BIOS, same motherboard (none of which you're
> > telling us.)
> >
>
> The only objective information is posted here:
> http://lkml.org/lkml/2009/5/20/342
> Everything else related to this problem is subjective.
>
> > cpuinfo and dmidecode would be informative.
> >
>
> Must have hit "reply" rather than "reply all" at some
> critical point along the way.
>
> Browse this directory:
> http://hp-umpc.com/ce1200v/
> You're looking for:
> http://hp-umpc.com/ce1200v/sylvania-g-data.tar.gz
> The Everex Cloudbook only varies by some strings in the
> dmidecode output.
>
> For logs of speculation and efforts at re-arranging the
> deck chairs on the Titanic:
> http://forum.netbookuser.com/viewforum.php?id=8
>
> I chose to start with the ce1200v because:
> *) It needs the most help;
> *) One of the two tech manuals on the cx700 has been
> published since the drivers were touched.
> see: http://linux.via.com.tw/support/downloadFiles.action
> select cx700/vx700 in right-hand box, click the manual.
> *) The HP-2133 MiniNote uses the cn896 chipset, which
> has not yet been released from NDA.
>
> Note:
> I do not have the C7-M technical reference, it is still
> under NDA.
> *But* if a developer on this list has a copy of the
> manual *and* owns one of these three brands of machine -
> they would have fixed their own machine a year ago.
> So I am not holding my breath until a person is located
> that has both the manual and the machine. ;)
>
As to getting a person with the manuals on-hand and everyone
else together, may I point out that the MUC room is still on-line:
Pick your favorite Jabber MUC client;
JID: cloudbook-group@conference.jabber.cb-chat.com
Which translates to:
Room: cloudbook-group
URL: conference.jabber.cb-chat.com
Password: <none> leave blank in your client, it is a public room.
Note: This is a low-volume server and has "on-demand" rooms enabled;
you want a room for the "hot topic" of the moment;
just "join chat" or "join group" (whatever your client calls it) for
the new room name - the server will create it for you.
If creating rooms, I strongly suggest Gajim - it has the easiest to
use control and administrative features.
Note2: Not using any video, so we will not see your smiling face. ;)
Mike
> Mike
>
>
> > -hpa
> >
> >
> >
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 0:51 ` H. Peter Anvin
@ 2009-05-23 10:44 ` Michael S. Zick
2009-05-23 11:18 ` Michael S. Zick
` (2 more replies)
2009-05-23 15:52 ` Michael S. Zick
` (2 subsequent siblings)
3 siblings, 3 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-23 10:44 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Fri May 22 2009, H. Peter Anvin wrote:
> Michael S. Zick wrote:
> > Same integrated motherboard.
>
> Which means same CPU, same BIOS, same motherboard (none of which you're
> telling us.)
>
The only objective information is posted here:
http://lkml.org/lkml/2009/5/20/342
Everything else related to this problem is subjective.
> cpuinfo and dmidecode would be informative.
>
Must have hit "reply" rather than "reply all" at some
critical point along the way.
Browse this directory:
http://hp-umpc.com/ce1200v/
You're looking for:
http://hp-umpc.com/ce1200v/sylvania-g-data.tar.gz
The Everex Cloudbook only varies by some strings in the
dmidecode output.
For logs of speculation and efforts at re-arranging the
deck chairs on the Titanic:
http://forum.netbookuser.com/viewforum.php?id=8
I chose to start with the ce1200v because:
*) It needs the most help;
*) One of the two tech manuals on the cx700 has been
published since the drivers were touched.
see: http://linux.via.com.tw/support/downloadFiles.action
select cx700/vx700 in right-hand box, click the manual.
*) The HP-2133 MiniNote uses the cn896 chipset, which
has not yet been released from NDA.
Note:
I do not have the C7-M technical reference, it is still
under NDA.
*But* if a developer on this list has a copy of the
manual *and* owns one of these three brands of machine -
they would have fixed their own machine a year ago.
So I am not holding my breath until a person is located
that has both the manual and the machine. ;)
Mike
> -hpa
>
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-23 0:45 ` Michael S. Zick
@ 2009-05-23 0:51 ` H. Peter Anvin
2009-05-23 10:44 ` Michael S. Zick
` (3 more replies)
0 siblings, 4 replies; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-23 0:51 UTC (permalink / raw)
To: lkml; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
Michael S. Zick wrote:
> Same integrated motherboard.
Which means same CPU, same BIOS, same motherboard (none of which you're
telling us.)
cpuinfo and dmidecode would be informative.
-hpa
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 23:30 ` H. Peter Anvin
@ 2009-05-23 0:45 ` Michael S. Zick
2009-05-23 0:51 ` H. Peter Anvin
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-23 0:45 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Fri May 22 2009, H. Peter Anvin wrote:
> If there is a driver which relies on locked operations to be atomic with
> respect to the I/O subsystem, it needs to use true locks, not LOCK_PREFIX.
>
> An interrupt cannot interrupt between two parts of a lockable
> instruction even if it isn't locked (there are non-atomic instructions
> in the x86 architecture, but they can never be locked.)
>
> The other thing that you might be seeing is that a locked operation may
> be slow enough to keep an otherwise-present race condition from being
> triggered.
>
> > That tells us nothing, since the CPU technical details are under NDA.
>
> Have you considered that you might be running into a CPU bug or design
> error? There was the out-of-order store bug on the Winchip that needed
> workarounds (CONFIG_X86_OOSTORE) that I don't think were ever well
> tested and might very well have bitrotted?
>
> > All that can be done in this case is report behavior differences from
> > the closest publicly described processor (Pentium-M).
> >
> > For that purpose, I suggest that a single processor box, with other
> > hardware that makes memory access independent of the processor's
> > control using a processor older than P-4 is a potential test bed.
> > "Other hardware that makes memory access..." I previously termed:
> > "bus master DMA" - which is overly specific. It misleads people
> > into thinking I am seeing hardware control issues rather than
> > non-exclusive memory access.
> >
> > My earlier comments about taking an interrupt between the memory read
> > and the memory write operations is from a different manual than the
> > one posted. A manual that only applies to processors older than
> > the ones supported by the Linux kernel.
> > Sorry, my bad, grabbed the wrong book, posted the correct link (SH).
> >
> > Until one or more specific usages of the LOCK_PREFIX macro can be
> > demonstrated to be incorrect (at least for some of the processors
> > using this code) - -
> >
> > Then making the posted change is a single point change that gives a
> > pair of builds (one with, one without) to compare the behavior of on
> > the test bed.
> >
> > It is *not* the preferred change for a general release kernel, the
> > preferred change would be one that makes a specific rather than
> > general correction.
> > Perhaps only for some functions, perhaps only for some of the
> > processors that currently select this code.
> >
> > The observation that executing an unnecessary 'lock' opcode in some
> > cases slows down the machine is not felt by myself to be significant
> > to duplicating my observations. Note: I have been wrong before.
>
> What makes you draw that conclusion, in particular? A lock prefix
> typically slows down the following instruction dramatically, on some
> processors by many hundreds of cycles.
>
> > This is as informative as I can make the message.
> >
> > PS: *not* a single machine failure, tested on five machines, owned
> > by four different people, two brands, with different use histories.
>
> What do they have in common?
>
Same integrated motherboard.
There is very little information to be gained from staring at a glowing
power on light, that only glows back. ;)
The lockdep dump posted is the best source of information.
Other observations -
Here is something which these machines do, which may not be happening
with your choice of test machines:
ACPI: Core revision 20090320
..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found apic 0 pin 0) ...
....... works.
Note: This is on a Uni-processor build.
I have not yet examined the code that generates that set of messages.
Might be a broken work-around?
With the LOCK_PREFIX == ""
Test conditions (same as the lockdep dump) -
VLC playing streaming audio over the wired net connection (8139too) -
from 4 to 8 ssh remote terminal sessions, each running "top" set
to use different display intervals (different in 0.1 second steps) -
Fixed cpu speed at half the rated clock (for the purpose of testing).
Now just hang back and listen for 10 minutes to 4 hours -
When the machine stops running -
You will still hear bursts of sound - -
I am *guessing* that this means the chip set and bus clocks are running,
also that DMA is running - with the result that the HD audio driver
is just replaying the same buffer offset.
There is a PCI-to-PCIe bridge in the chip set and the HD audio hardware
(also on chip) is the only thing detected on the PCIe bus.
The "hold down power button to stop" still works -
I presume that means at least that internal timer is still running.
Repeat the above, *with* LOCK_PREFIX == "\n\tlock; "
When the machine stops - with only minutes rather than hours of uptime -
The machine is silent - I presume this means that DMA is not running.
The "hold down power button to stop" still works -
So clocks are not totally off.
= = = =
Either "lock-up" situation acts as if:
*) cpu is halted with interrupts off; or
*) cpu is in a tight loop with interrupts off
The primary difference is that the DMA has been stopped in the second case.
Presuming my two guesses on that subject above are correct.
Mike
> -hpa
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 22:21 ` Michael S. Zick
@ 2009-05-22 23:30 ` H. Peter Anvin
2009-05-23 0:45 ` Michael S. Zick
2009-05-28 12:48 ` Pavel Machek
1 sibling, 1 reply; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-22 23:30 UTC (permalink / raw)
To: lkml; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
If there is a driver which relies on locked operations to be atomic with
respect to the I/O subsystem, it needs to use true locks, not LOCK_PREFIX.
An interrupt cannot interrupt between two parts of a lockable
instruction even if it isn't locked (there are non-atomic instructions
in the x86 architecture, but they can never be locked.)
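As an illustration of that point (a sketch under stated assumptions,
not the kernel's implementation): interrupts are taken on instruction
boundaries, so a single unlocked "incl" on memory cannot be split by an
interrupt on the same CPU, whereas an increment spelled out as separate
load, add and store steps can be. The demo_* names and demo_atomic_t
are made up for this example, and the non-x86 branch is only there so
it compiles elsewhere.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct { int counter; } demo_atomic_t;

/* One read-modify-write instruction, no lock prefix (the UP case).
 * An interrupt is delivered before or after it, never in the middle. */
static inline void demo_inc_single_insn(demo_atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("incl %0" : "+m" (v->counter));
#else
	v->counter++;	/* fallback; makes no atomicity claim */
#endif
}

/* The same increment as three separate steps. An interrupt handler
 * that touches the counter between the load and the store would have
 * its update overwritten. */
static inline void demo_inc_three_steps(demo_atomic_t *v)
{
	volatile int *p = &v->counter;
	int tmp = *p;	/* load */

	tmp += 1;	/* modify */
	*p = tmp;	/* store */
}

/* Single-threaded, both give the same result; they differ only in
 * what a concurrent context could observe. */
static inline void demo_check_inc(void)
{
	demo_atomic_t v = { 0 };

	demo_inc_single_insn(&v);
	demo_inc_three_steps(&v);
	assert(v.counter == 2);
}
```

Neither variant says anything about other bus masters or other CPUs;
that is the case the lock prefix is for.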
The other thing that you might be seeing is that a locked operation may
be slow enough to keep an otherwise-present race condition from being
triggered.
> That tells us nothing, since the CPU technical details are under NDA.
Have you considered that you might be running into a CPU bug or design
error? There was the out-of-order store bug on the Winchip that needed
workarounds (CONFIG_X86_OOSTORE) that I don't think were ever well
tested and might very well have bitrotted?
> All that can be done in this case is report behavior differences from
> the closest publicly described processor (Pentium-M).
>
> For that purpose, I suggest that a single processor box, with other
> hardware that makes memory access independent of the processor's
> control using a processor older than P-4 is a potential test bed.
> "Other hardware that makes memory access..." I previously termed:
> "bus master DMA" - which is overly specific. It misleads people
> into thinking I am seeing hardware control issues rather than
> non-exclusive memory access.
>
> My earlier comments about taking an interrupt between the memory read
> and the memory write operations is from a different manual than the
> one posted. A manual that only applies to processors older than
> the ones supported by the Linux kernel.
> Sorry, my bad, grabbed the wrong book, posted the correct link (SH).
>
> Until one or more specific usages of the LOCK_PREFIX macro can be
> demonstrated to be incorrect (at least for some of the processors
> using this code) - -
>
> Then making the posted change is a single point change that gives a
> pair of builds (one with, one without) to compare the behavior of on
> the test bed.
>
> It is *not* the preferred change for a general release kernel, the
> preferred change would be one that makes a specific rather than
> general correction.
> Perhaps only for some functions, perhaps only for some of the
> processors that currently select this code.
>
> The observation that executing an unnecessary 'lock' opcode in some
> cases slows down the machine is not felt by myself to be significant
> to duplicating my observations. Note: I have been wrong before.
What makes you draw that conclusion, in particular? A lock prefix
typically slows down the following instruction dramatically, on some
processors by many hundreds of cycles.
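A rough way to see this on a given machine is to time the same loop
with and without the prefix. The sketch below is a user-space
approximation under loud assumptions: demo_time_incs() is a
hypothetical helper, clock() is coarse, and the result says nothing
about any particular CPU discussed in this thread.

```c
#include <assert.h>
#include <time.h>

/* Increment *counter n times, with or without a lock prefix, and
 * return the CPU time consumed in clock() ticks. */
static inline clock_t demo_time_incs(int *counter, long n, int locked)
{
	clock_t start = clock();
	long i;

	(void)locked;	/* unused in the non-x86 fallback */
	for (i = 0; i < n; i++) {
#if defined(__i386__) || defined(__x86_64__)
		if (locked)
			__asm__ __volatile__("lock; incl %0"
					     : "+m" (*counter));
		else
			__asm__ __volatile__("incl %0"
					     : "+m" (*counter));
#else
		(*counter)++;	/* fallback; no timing difference */
#endif
	}
	return clock() - start;
}
```

Calling it twice with a large n, once with locked set to 0 and once
with 1, and comparing the two return values gives a first impression;
how large the gap is depends entirely on the processor.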
> This is as informative as I can make the message.
>
> PS: *not* a single machine failure, tested on five machines, owned
> by four different people, two brands, with different use histories.
What do they have in common?
-hpa
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 18:59 ` H. Peter Anvin
2009-05-22 19:20 ` Michael S. Zick
@ 2009-05-22 22:21 ` Michael S. Zick
2009-05-22 23:30 ` H. Peter Anvin
2009-05-28 12:48 ` Pavel Machek
1 sibling, 2 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 22:21 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Fri May 22 2009, H. Peter Anvin wrote:
> Ingo Molnar wrote:
> > * Michael S. Zick <lkml@morethan.org> wrote:
> >
> >> Found in the bit-rot for 32-bit, x86, Uni-processor builds:
> >>
> >> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> >> index f6aa18e..3c790ef 100644
> >> --- a/arch/x86/include/asm/alternative.h
> >> +++ b/arch/x86/include/asm/alternative.h
> >> @@ -35,7 +35,7 @@
> >> "661:\n\tlock; "
> >>
> >> #else /* ! CONFIG_SMP */
> >> -#define LOCK_PREFIX ""
> >> +#define LOCK_PREFIX "\n\tlock; "
> >> #endif
> >
> > What is your motivation for this change? At first sight this makes
> > the UP kernel a bit larger and a bit slower. Are you fixing some
> > real regression/bug here?
> >
>
> That looks very odd indeed. The whole point of the LOCK_PREFIX macro is
> to squelch it on UP (locks that should not be squelched on UP should not
> be annotated LOCK_PREFIX.)
>
I can only act as a messenger to report the behavior I observe;
But let me see if I can't do a better job of that limited role.
hpa makes the best point of all in the responses here...
What I see (erratic operation, erratic lock-ups of the machine,
and the previously posted lockdep dump) -
This may well be misplaced usage of the LOCK_PREFIX macro;
I have already agreed to keep my eyes open for this more
specific problem.
A secondary possibility, hinted at in the context of other replies;
The usage of the LOCK_PREFIX may not apply equally to all processors
for which this code gets included.
It is possible that I am building for one of the exceptions.
That tells us nothing, since the CPU technical details are under NDA.
All that can be done in this case is report behavior differences from
the closest publicly described processor (Pentium-M).
For that purpose, I suggest that a single processor box, with other
hardware that makes memory access independent of the processor's
control using a processor older than P-4 is a potential test bed.
"Other hardware that makes memory access..." I previously termed:
"bus master DMA" - which is overly specific. It misleads people
into thinking I am seeing hardware control issues rather than
non-exclusive memory access.
My earlier comments about taking an interrupt between the memory read
and the memory write operations is from a different manual than the
one posted. A manual that only applies to processors older than
the ones supported by the Linux kernel.
Sorry, my bad, grabbed the wrong book, posted the correct link (SH).
Until one or more specific usages of the LOCK_PREFIX macro can be
demonstrated to be incorrect (at least for some of the processors
using this code) - -
Then making the posted change is a single point change that gives a
pair of builds (one with, one without) to compare the behavior of on
the test bed.
It is *not* the preferred change for a general release kernel, the
preferred change would be one that makes a specific rather than
general correction.
Perhaps only for some functions, perhaps only for some of the
processors that currently select this code.
The observation that executing an unnecessary 'lock' opcode in some
cases slows down the machine is not felt by myself to be significant
to duplicating my observations. Note: I have been wrong before.
This is as informative as I can make the message.
PS: *not* a single machine failure, tested on five machines, owned
by four different people, two brands, with different use histories.
Mike
> -hpa
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 22:00 ` Samuel Thibault
@ 2009-05-22 22:14 ` Andi Kleen
2009-05-22 22:14 ` Samuel Thibault
0 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2009-05-22 22:14 UTC (permalink / raw)
To: Samuel Thibault, Andi Kleen, Michael S. Zick, linux-kernel
> That's what I meant: AIUI, LOCK_PREFIX has always only been used for
> inter-processor interaction (atomic variables, spinlocks, etc.), not for
PCI has a locked transaction, but I don't think it's widely supported.
With normal uncached access it is also not very useful.
> processor-device interaction.
Well in Linux yes, but not architecturally in x86. That is why the CPUs
don't just nop it out with a single core (which Michael assumes they do,
but they don't)
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 22:14 ` Andi Kleen
@ 2009-05-22 22:14 ` Samuel Thibault
0 siblings, 0 replies; 65+ messages in thread
From: Samuel Thibault @ 2009-05-22 22:14 UTC (permalink / raw)
To: Andi Kleen; +Cc: Michael S. Zick, linux-kernel
Andi Kleen, le Sat 23 May 2009 00:14:56 +0200, a écrit :
> > That's what I meant: AIUI, LOCK_PREFIX has always only been used for
> > inter-processor interaction (atomic variables, spinlocks, etc.), not for
>
> PCI has a locked transaction, but I don't think it's widely supported.
> With normal uncached access it is also not very useful.
I'm not talking about the LOCK prefix. I'm talking about the
LOCK_PREFIX macro. I'm saying that AIUI it has never been meant to
be used for processor-device interaction, even if the LOCK prefix could
be used for that.
Samuel
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 21:59 ` Andi Kleen
@ 2009-05-22 22:00 ` Samuel Thibault
2009-05-22 22:14 ` Andi Kleen
0 siblings, 1 reply; 65+ messages in thread
From: Samuel Thibault @ 2009-05-22 22:00 UTC (permalink / raw)
To: Andi Kleen; +Cc: Michael S. Zick, linux-kernel
Andi Kleen, le Fri 22 May 2009 23:59:39 +0200, a écrit :
> > > - You got bus master DMA in your test machine?
> >
> > That's not related to the LOCK_PREFIX concern, which is about the
> > processor only, not interaction with other devices.
>
> Actually it's related to other devices; but only very few (most MMIO
> doesn't support atomic cycles and is uncached anyways). But there's no driver
> for real hardware in Linux that relies on it to my knowledge.
That's what I meant: AIUI, LOCK_PREFIX has always only been used for
inter-processor interaction (atomic variables, spinlocks, etc.), not for
processor-device interaction.
Samuel
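
For readers less familiar with the "inter-processor interaction"
primitives mentioned above, here is a minimal sketch of a spinlock in
portable C11. It is not the kernel's spinlock implementation; the
demo_* names are hypothetical, and atomic_flag stands in for the
locked read-modify-write that LOCK_PREFIX protects on SMP builds.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy spinlock; not the kernel's spinlock_t. */
typedef struct {
	atomic_flag held;
} demo_spinlock_t;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock; returns true if we acquired it. */
static inline bool demo_spin_trylock(demo_spinlock_t *l)
{
	/* test_and_set returns the previous value: false means it was free. */
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
static inline void demo_spin_lock(demo_spinlock_t *l)
{
	while (!demo_spin_trylock(l))
		; /* spin */
}

/* Release the lock so another CPU can take it. */
static inline void demo_spin_unlock(demo_spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The atomic test-and-set only matters when a second CPU can run between
the test and the set, which is the case the LOCK_PREFIX macro is meant
to cover.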
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 20:43 ` Samuel Thibault
@ 2009-05-22 21:59 ` Andi Kleen
2009-05-22 22:00 ` Samuel Thibault
0 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2009-05-22 21:59 UTC (permalink / raw)
To: Samuel Thibault, Michael S. Zick, Andi Kleen, linux-kernel
> > - You got bus master DMA in your test machine?
>
> That's not related to the LOCK_PREFIX concern, which is about the
> processor only, not interaction with other devices.
Actually it's related to other devices; but only very few (most MMIO
doesn't support atomic cycles and is uncached anyways). But there's no driver
for real hardware in Linux that relies on it to my knowledge.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 20:42 ` Andi Kleen
@ 2009-05-22 20:57 ` Michael S. Zick
0 siblings, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 20:57 UTC (permalink / raw)
To: Andi Kleen; +Cc: Samuel Thibault, linux-kernel
On Fri May 22 2009, Andi Kleen wrote:
> > If it: "Does not make a difference" then it "Should not make a difference"
> > but it does, try it yourself. It's safe (if LOCK_PREFIX is in the proper
> > places) - the machine will ignore the opcode if it is recent enough to not
> > need it - just trust the cpu's micro-code.
>
> It doesn't ignore it, in fact it's extremely slow on some older systems
> where all atomic operations are very costly.
> That is why LOCK is avoided as much as possible.
>
I'm only the messenger.
Mike
> -Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 20:32 ` Michael S. Zick
2009-05-22 20:42 ` Andi Kleen
2009-05-22 20:43 ` Samuel Thibault
@ 2009-05-22 20:45 ` Roland Dreier
2009-05-24 18:59 ` Robert Hancock
3 siblings, 0 replies; 65+ messages in thread
From: Roland Dreier @ 2009-05-22 20:45 UTC (permalink / raw)
To: lkml; +Cc: Samuel Thibault, Andi Kleen, linux-kernel
> > > Ref: http://developer.intel.com/Assets/PDF/manual/253666.pdf
> > > Manual page: 3-590 PDF page: 638
> > > Summary: Processors prior to P-4 can take an interrupt between
> > > the read cycle and the write cycle. Which is why opcode 0xF0 exists.
> > Where do you see page 638/639 talking about interrupts? It talks about
> > multi-processor machines.
> No - it talks about "exclusive memory access" - You got bus master DMA
> in your test machine? You also have an older than P-4 single processor?
I looked at the page you refer to. It talks about asserting the LOCK#
signal -- there is absolutely no mention of the lock prefix having any
effect on the execution of an instruction internal to a single CPU.
Could you be more specific about what you are referring to?
> Look people, I just reported what I found from testing -
> Please don't shoot the messenger.
Could you be specific about the test you are doing? What operation are
you doing that is missing the lock prefix? What is the expected result,
and what actually happens without the lock prefix?
- R.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 20:32 ` Michael S. Zick
2009-05-22 20:42 ` Andi Kleen
@ 2009-05-22 20:43 ` Samuel Thibault
2009-05-22 21:59 ` Andi Kleen
2009-05-22 20:45 ` Roland Dreier
2009-05-24 18:59 ` Robert Hancock
3 siblings, 1 reply; 65+ messages in thread
From: Samuel Thibault @ 2009-05-22 20:43 UTC (permalink / raw)
To: Michael S. Zick; +Cc: Andi Kleen, linux-kernel
Michael S. Zick, le Fri 22 May 2009 15:32:41 -0500, a écrit :
> On Fri May 22 2009, Samuel Thibault wrote:
> > Michael S. Zick, le Fri 22 May 2009 14:53:39 -0500, a écrit :
> > > Ref: http://developer.intel.com/Assets/PDF/manual/253666.pdf
> > > Manual page: 3-590 PDF page: 638
> > > Summary: Processors prior to P-4 can take an interrupt between
> > > the read cycle and the write cycle. Which is why opcode 0xF0 exists.
> >
> > Where do you see page 638/639 talking about interrupts? It talks about
> > multi-processor machines.
>
> No - it talks about "exclusive memory access"
Right, that's still not interrupts.
> - You got bus master DMA in your test machine?
That's not related to the LOCK_PREFIX concern, which is about the
processor only, not interaction with other devices.
> Look people, I just reported what I found from testing -
What did you test, precisely?
Samuel
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 20:32 ` Michael S. Zick
@ 2009-05-22 20:42 ` Andi Kleen
2009-05-22 20:57 ` Michael S. Zick
2009-05-22 20:43 ` Samuel Thibault
` (2 subsequent siblings)
3 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2009-05-22 20:42 UTC (permalink / raw)
To: Michael S. Zick; +Cc: Samuel Thibault, Andi Kleen, linux-kernel
> If it: "Does not make a difference" then it "Should not make a difference"
> but it does, try it yourself. It's safe (if LOCK_PREFIX is in the proper
> places) - the machine will ignore the opcode if it is recent enough to not
> need it - just trust the cpu's micro-code.
It doesn't ignore it, in fact it's extremely slow on some older systems
where all atomic operations are very costly.
That is why LOCK is avoided as much as possible.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 20:05 ` Samuel Thibault
@ 2009-05-22 20:32 ` Michael S. Zick
2009-05-22 20:42 ` Andi Kleen
` (3 more replies)
0 siblings, 4 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 20:32 UTC (permalink / raw)
To: Samuel Thibault; +Cc: Andi Kleen, linux-kernel
On Fri May 22 2009, Samuel Thibault wrote:
> Michael S. Zick, le Fri 22 May 2009 14:53:39 -0500, a écrit :
> > Ref: http://developer.intel.com/Assets/PDF/manual/253666.pdf
> > Manual page: 3-590 PDF page: 638
> > Summary: Processors prior to P-4 can take an interrupt between
> > the read cycle and the write cycle. Which is why opcode 0xF0 exists.
>
> Where do you see page 638/639 talking about interrupts? It talks about
> multi-processor machines.
>
No - it talks about "exclusive memory access" - You got bus master DMA
in your test machine? You also have an older than P-4 single processor?
Look people, I just reported what I found from testing -
Please don't shoot the messenger.
If it: "Does not make a difference" then it "Should not make a difference"
but it does, try it yourself. It's safe (if LOCK_PREFIX is in the proper
places) - the machine will ignore the opcode if it is recent enough to not
need it - just trust the cpu's micro-code.
Mike
> Samuel
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 19:53 ` Michael S. Zick
@ 2009-05-22 20:05 ` Samuel Thibault
2009-05-22 20:32 ` Michael S. Zick
0 siblings, 1 reply; 65+ messages in thread
From: Samuel Thibault @ 2009-05-22 20:05 UTC (permalink / raw)
To: Michael S. Zick; +Cc: Andi Kleen, linux-kernel
Michael S. Zick, le Fri 22 May 2009 14:53:39 -0500, a écrit :
> Ref: http://developer.intel.com/Assets/PDF/manual/253666.pdf
> Manual page: 3-590 PDF page: 638
> Summary: Processors prior to P-4 can take an interrupt between
> the read cycle and the write cycle. Which is why opcode 0xF0 exists.
Where do you see page 638/639 talking about interrupts? It talks about
multi-processor machines.
Samuel
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
[not found] ` <20090522192329.GF846@one.firstfloor.org>
@ 2009-05-22 19:53 ` Michael S. Zick
2009-05-22 20:05 ` Samuel Thibault
0 siblings, 1 reply; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 19:53 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
On Fri May 22 2009, Andi Kleen wrote:
> On Fri, May 22, 2009 at 01:43:27PM -0500, Michael S. Zick wrote:
> > On Fri May 22 2009, you wrote:
> > > "Michael S. Zick" <lkml@morethan.org> writes:
> > >
> > > > Found in the bit-rot for 32-bit, x86, Uni-processor builds:
> > >
> > > Actually uni processor should not use the lock prefix
> > > because it doesn't need it; the only exception are some special
> > > ops used in para-virtualization which are special cased.
> > >
> >
> > Unless you have interrupts enabled, then you have two contexts.
>
> Interrupts on the local CPU don't interrupt instructions, only
> in between.
>
Ref: http://developer.intel.com/Assets/PDF/manual/253666.pdf
Manual page: 3-590 PDF page: 638
Summary: Processors prior to P-4 can take an interrupt between
the read cycle and the write cycle. Which is why opcode 0xF0 exists.
Mike
> -Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 18:59 ` H. Peter Anvin
@ 2009-05-22 19:20 ` Michael S. Zick
2009-05-22 22:21 ` Michael S. Zick
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 19:20 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel
On Fri May 22 2009, H. Peter Anvin wrote:
> Ingo Molnar wrote:
> > * Michael S. Zick <lkml@morethan.org> wrote:
> >
> >> Found in the bit-rot for 32-bit, x86, Uni-processor builds:
> >>
> >> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> >> index f6aa18e..3c790ef 100644
> >> --- a/arch/x86/include/asm/alternative.h
> >> +++ b/arch/x86/include/asm/alternative.h
> >> @@ -35,7 +35,7 @@
> >> "661:\n\tlock; "
> >>
> >> #else /* ! CONFIG_SMP */
> >> -#define LOCK_PREFIX ""
> >> +#define LOCK_PREFIX "\n\tlock; "
> >> #endif
> >
> > What is your motivation for this change? At first sight this makes
> > the UP kernel a bit larger and a bit slower. Are you fixing some
> > real regression/bug here?
> >
>
> That looks very odd indeed. The whole point of the LOCK_PREFIX macro is
> to squelch it on UP (locks that should not be squelched on UP should not
> be annotated LOCK_PREFIX.)
>
OK, will inspect for that possibility.
We may just have a mis-use of LOCK_PREFIX.
Mike
> -hpa
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 18:36 ` Ingo Molnar
2009-05-22 18:59 ` H. Peter Anvin
@ 2009-05-22 19:17 ` Michael S. Zick
1 sibling, 0 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 19:17 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel
On Fri May 22 2009, you wrote:
>
> * Michael S. Zick <lkml@morethan.org> wrote:
>
> > Found in the bit-rot for 32-bit, x86, Uni-processor builds:
> >
> > diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> > index f6aa18e..3c790ef 100644
> > --- a/arch/x86/include/asm/alternative.h
> > +++ b/arch/x86/include/asm/alternative.h
> > @@ -35,7 +35,7 @@
> > "661:\n\tlock; "
> >
> > #else /* ! CONFIG_SMP */
> > -#define LOCK_PREFIX ""
> > +#define LOCK_PREFIX "\n\tlock; "
> > #endif
>
> What is your motivation for this change? At first sight this makes
> the UP kernel a bit larger and a bit slower. Are you fixing some
> real regression/bug here?
>
Yes - but not easy to test for unless you have hardware that can
generate an interrupt flood for a long enough period of time to
catch the atomic ops in between the read bus cycle and the write
bus cycle - a very small window.
As luck (good? bad? ugly?) would have it, I have an SDHC card and
machine organization that will trigger a flood from the ehci_hcd driver.
A poor man's test setup.
Even with that bit of luck, it takes from minutes to hours to hit the window.
The single lockdep dump I posted was the result of nearly a month's testing.
It is a _small_ window. ;)
Mike
> Ingo
>
>
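
The failure being hypothesized here is the classic lost update. The
sketch below is a purely illustrative, single-threaded model of that
pattern, with hypothetical names throughout; it does not claim that
x86 hardware behaves this way, and other participants in this thread
argue that interrupts are only taken on instruction boundaries.

```c
/* Hypothetical model: one shared counter living in memory. */
struct rmw_model {
	int memory;
};

/*
 * Increment modelled as separate read and write cycles. If
 * interrupt_in_window is non-zero, a simulated handler performs its
 * own complete increment between the two cycles.
 */
static int split_increment(struct rmw_model *m, int interrupt_in_window)
{
	int tmp = m->memory;		/* read cycle */

	if (interrupt_in_window)
		m->memory += 1;		/* handler's increment */

	m->memory = tmp + 1;		/* write cycle: handler's update is lost */
	return m->memory;
}

/*
 * Increment modelled as one indivisible instruction. A pending
 * interrupt can only be serviced after the instruction completes.
 */
static int whole_increment(struct rmw_model *m, int interrupt_pending)
{
	m->memory += 1;			/* read and write, no window */

	if (interrupt_pending)
		m->memory += 1;		/* handler runs afterwards */

	return m->memory;
}
```

Starting from zero, the split model ends with one increment recorded
where two were performed, while the indivisible model records both.
Which of the two models matches the processors under test is exactly
the question in dispute.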
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 18:36 ` Ingo Molnar
@ 2009-05-22 18:59 ` H. Peter Anvin
2009-05-22 19:20 ` Michael S. Zick
2009-05-22 22:21 ` Michael S. Zick
2009-05-22 19:17 ` Michael S. Zick
1 sibling, 2 replies; 65+ messages in thread
From: H. Peter Anvin @ 2009-05-22 18:59 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Michael S. Zick, Thomas Gleixner, linux-kernel
Ingo Molnar wrote:
> * Michael S. Zick <lkml@morethan.org> wrote:
>
>> Found in the bit-rot for 32-bit, x86, Uni-processor builds:
>>
>> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
>> index f6aa18e..3c790ef 100644
>> --- a/arch/x86/include/asm/alternative.h
>> +++ b/arch/x86/include/asm/alternative.h
>> @@ -35,7 +35,7 @@
>> "661:\n\tlock; "
>>
>> #else /* ! CONFIG_SMP */
>> -#define LOCK_PREFIX ""
>> +#define LOCK_PREFIX "\n\tlock; "
>> #endif
>
> What is your motivation for this change? At first sight this makes
> the UP kernel a bit larger and a bit slower. Are you fixing some
> real regression/bug here?
>
That looks very odd indeed. The whole point of the LOCK_PREFIX macro is
to squelch it on UP (locks that should not be squelched on UP should not
be annotated LOCK_PREFIX.)
-hpa
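
A minimal user-space sketch of the distinction drawn above, assuming a
hypothetical DEMO_SMP switch in place of CONFIG_SMP and hypothetical
demo_* names. The real macro in alternative.h does more than this on
SMP (the "661:" label in the quoted diff suggests it also records where
each prefix sits), and that part is omitted here.

```c
/* Hypothetical stand-in for the kernel's CONFIG_SMP switch. */
#ifdef DEMO_SMP
#define LOCK_PREFIX "lock; "	/* SMP build: keep the prefix */
#else
#define LOCK_PREFIX ""		/* UP build: the prefix is squelched */
#endif

typedef struct {
	int counter;
} demo_atomic_t;

/* Annotated with LOCK_PREFIX: locked on SMP builds only. */
static inline void demo_atomic_inc(demo_atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__(LOCK_PREFIX "incl %0" : "+m" (v->counter));
#else
	v->counter++;	/* non-x86 fallback so the sketch still builds */
#endif
}

/* Not annotated: the prefix is spelled out, so it survives on UP too. */
static inline void demo_locked_inc(demo_atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; incl %0" : "+m" (v->counter));
#else
	v->counter++;	/* non-x86 fallback so the sketch still builds */
#endif
}
```

With this split, a call site that genuinely needs the bus lock on a
uniprocessor can be fixed individually by using the second form, rather
than by redefining the macro for every user.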
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 16:39 Michael S. Zick
2009-05-22 18:23 ` Andi Kleen
@ 2009-05-22 18:36 ` Ingo Molnar
2009-05-22 18:59 ` H. Peter Anvin
2009-05-22 19:17 ` Michael S. Zick
[not found] ` <200905221343.30638.lkml@morethan.org>
2 siblings, 2 replies; 65+ messages in thread
From: Ingo Molnar @ 2009-05-22 18:36 UTC (permalink / raw)
To: Michael S. Zick, H. Peter Anvin, Thomas Gleixner; +Cc: linux-kernel
* Michael S. Zick <lkml@morethan.org> wrote:
> Found in the bit-rot for 32-bit, x86, Uni-processor builds:
>
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index f6aa18e..3c790ef 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -35,7 +35,7 @@
> "661:\n\tlock; "
>
> #else /* ! CONFIG_SMP */
> -#define LOCK_PREFIX ""
> +#define LOCK_PREFIX "\n\tlock; "
> #endif
What is your motivation for this change? At first sight this makes
the UP kernel a bit larger and a bit slower. Are you fixing some
real regression/bug here?
Ingo
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
2009-05-22 16:39 Michael S. Zick
@ 2009-05-22 18:23 ` Andi Kleen
2009-05-22 18:36 ` Ingo Molnar
[not found] ` <200905221343.30638.lkml@morethan.org>
2 siblings, 0 replies; 65+ messages in thread
From: Andi Kleen @ 2009-05-22 18:23 UTC (permalink / raw)
To: lkml; +Cc: linux-kernel
"Michael S. Zick" <lkml@morethan.org> writes:
> Found in the bit-rot for 32-bit, x86, Uni-processor builds:
Actually uni processor should not use the lock prefix
because it doesn't need it; the only exception are some special
ops used in para-virtualization which are special cased.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 65+ messages in thread
* [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic
@ 2009-05-22 16:39 Michael S. Zick
2009-05-22 18:23 ` Andi Kleen
` (2 more replies)
0 siblings, 3 replies; 65+ messages in thread
From: Michael S. Zick @ 2009-05-22 16:39 UTC (permalink / raw)
To: linux-kernel
Found in the bit-rot for 32-bit, x86, Uni-processor builds:
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index f6aa18e..3c790ef 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -35,7 +35,7 @@
 "661:\n\tlock; "
 
 #else /* ! CONFIG_SMP */
-#define LOCK_PREFIX ""
+#define LOCK_PREFIX "\n\tlock; "
 #endif
 
 /* This must be included *after* the definition of LOCK_PREFIX */
Submitted: M. S. Zick
^ permalink raw reply related [flat|nested] 65+ messages in thread
end of thread, other threads:[~2009-05-30 15:48 UTC | newest]
Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-22 18:50 [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic Michael S. Zick
2009-05-22 19:24 ` Roland Dreier
2009-05-22 20:03 ` Michael S. Zick
-- strict thread matches above, loose matches on Subject: below --
2009-05-22 16:39 Michael S. Zick
2009-05-22 18:23 ` Andi Kleen
2009-05-22 18:36 ` Ingo Molnar
2009-05-22 18:59 ` H. Peter Anvin
2009-05-22 19:20 ` Michael S. Zick
2009-05-22 22:21 ` Michael S. Zick
2009-05-22 23:30 ` H. Peter Anvin
2009-05-23 0:45 ` Michael S. Zick
2009-05-23 0:51 ` H. Peter Anvin
2009-05-23 10:44 ` Michael S. Zick
2009-05-23 11:18 ` Michael S. Zick
2009-05-24 7:04 ` Harald Welte
2009-05-24 12:48 ` Michael S. Zick
2009-05-24 15:43 ` Michael S. Zick
2009-05-27 22:13 ` Roland Dreier
2009-05-27 22:33 ` Michael S. Zick
2009-05-23 15:52 ` Michael S. Zick
2009-05-23 18:04 ` Michael S. Zick
2009-05-23 23:44 ` H. Peter Anvin
2009-05-24 6:49 ` Harald Welte
2009-05-24 12:38 ` Michael S. Zick
2009-05-24 17:31 ` Harald Welte
2009-05-30 15:48 ` Michael S. Zick
2009-05-24 12:27 ` Michael S. Zick
2009-05-24 17:22 ` Harald Welte
2009-05-24 18:00 ` H. Peter Anvin
2009-05-24 18:32 ` Michael S. Zick
2009-05-24 18:46 ` H. Peter Anvin
2009-05-24 19:09 ` Michael S. Zick
2009-05-25 19:03 ` Michael S. Zick
2009-05-25 19:18 ` Michael S. Zick
2009-05-25 19:46 ` Michael S. Zick
2009-05-25 21:10 ` Michael S. Zick
2009-05-25 21:17 ` H. Peter Anvin
2009-05-25 23:03 ` Michael S. Zick
2009-05-25 23:35 ` Michael S. Zick
2009-05-26 0:05 ` H. Peter Anvin
2009-05-26 12:37 ` Michael S. Zick
2009-05-26 17:13 ` H. Peter Anvin
2009-05-25 16:05 ` Michael S. Zick
2009-05-28 20:30 ` Pavel Machek
2009-05-28 20:54 ` Michael S. Zick
2009-05-23 20:51 ` Michael S. Zick
2009-05-28 12:48 ` Pavel Machek
2009-05-28 13:29 ` Michael S. Zick
2009-05-28 20:50 ` Pavel Machek
2009-05-28 20:58 ` Michael S. Zick
2009-05-28 21:16 ` Pavel Machek
2009-05-28 21:21 ` Michael S. Zick
2009-05-22 19:17 ` Michael S. Zick
[not found] ` <200905221343.30638.lkml@morethan.org>
[not found] ` <20090522192329.GF846@one.firstfloor.org>
2009-05-22 19:53 ` Michael S. Zick
2009-05-22 20:05 ` Samuel Thibault
2009-05-22 20:32 ` Michael S. Zick
2009-05-22 20:42 ` Andi Kleen
2009-05-22 20:57 ` Michael S. Zick
2009-05-22 20:43 ` Samuel Thibault
2009-05-22 21:59 ` Andi Kleen
2009-05-22 22:00 ` Samuel Thibault
2009-05-22 22:14 ` Andi Kleen
2009-05-22 22:14 ` Samuel Thibault
2009-05-22 20:45 ` Roland Dreier
2009-05-24 18:59 ` Robert Hancock