LKML Archive on lore.kernel.org
* Re: 4k stacks in 2.6
       [not found]     ` <203Zu-4aT-15@gated-at.bofh.it>
@ 2004-05-26 13:57       ` Andi Kleen
  2004-05-26 18:17         ` hch
  2004-05-26 20:39         ` Zwane Mwaikambo
       [not found]       ` <206b3-5WN-33@gated-at.bofh.it>
  1 sibling, 2 replies; 39+ messages in thread
From: Andi Kleen @ 2004-05-26 13:57 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: andrea, linux-kernel, arjanv

Ingo Molnar <mingo@elte.hu> writes:
>
> do you realize that the 4K stacks feature also adds a separate softirq
> and a separate hardirq stack? So the maximum footprint is 4K+4K+4K, with

A nice combination would be 8K process stacks with separate irq stacks on 
i386.

Any chance the CONFIGs for those two could be split? 

-Andi


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 13:57       ` 4k stacks in 2.6 Andi Kleen
@ 2004-05-26 18:17         ` hch
  2004-05-26 18:24           ` Andi Kleen
  2004-05-26 20:39         ` Zwane Mwaikambo
  1 sibling, 1 reply; 39+ messages in thread
From: hch @ 2004-05-26 18:17 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, andrea, linux-kernel, arjanv

On Wed, May 26, 2004 at 03:57:05PM +0200, Andi Kleen wrote:
> Ingo Molnar <mingo@elte.hu> writes:
> >
> > do you realize that the 4K stacks feature also adds a separate softirq
> > and a separate hardirq stack? So the maximum footprint is 4K+4K+4K, with
> 
> A nice combination would be 8K process stacks with separate irq stacks on 
> i386.
> 
> Any chance the CONFIGs for those two could be split? 

Any reason not to enable interrupt stacks unconditionally and leave
the stack size choice to the user?


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 18:17         ` hch
@ 2004-05-26 18:24           ` Andi Kleen
  0 siblings, 0 replies; 39+ messages in thread
From: Andi Kleen @ 2004-05-26 18:24 UTC (permalink / raw)
  To: hch, Andi Kleen, Ingo Molnar, andrea, linux-kernel, arjanv

On Wed, May 26, 2004 at 02:17:34PM -0400, hch@infradead.org wrote:
> On Wed, May 26, 2004 at 03:57:05PM +0200, Andi Kleen wrote:
> > Ingo Molnar <mingo@elte.hu> writes:
> > >
> > > do you realize that the 4K stacks feature also adds a separate softirq
> > > and a separate hardirq stack? So the maximum footprint is 4K+4K+4K, with
> > 
> > A nice combination would be 8K process stacks with separate irq stacks on 
> > i386.
> > 
> > Any chance the CONFIGs for those two could be split? 
> 
> Any reason not to enable interrupt stacks unconditionally and leave
> the stack size choice to the user?

It will probably still break some other patches, like debuggers.

Given that the kernel is supposed to be stable I would not change
it unconditionally in 2.6. Maybe in 2.7.

-Andi

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
       [not found]         ` <20baw-1Lz-15@gated-at.bofh.it>
@ 2004-05-26 19:32           ` Andi Kleen
  2004-05-27 11:27             ` Jörn Engel
  0 siblings, 1 reply; 39+ messages in thread
From: Andi Kleen @ 2004-05-26 19:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: joern, mingo, andrea, riel, torvalds, arjanv, linux-kernel

"David S. Miller" <davem@redhat.com> writes:

>> Change gcc to catch stack overflows before the fact and disallow
>> module load unless modules have those checks as well.

It's impossible to do anything but panic, so it's not too helpful
in practice. You can only do better for interrupts
(not handle an interrupt when the stack is too low).

> That's easy, just enable profiling then implement a suitable
> _mcount that checks for stack overflow.  I bet someone has done
> this already.

I did it for x86-64 a long time ago. Should be easy to port to i386 
too.

ftp://ftp.x86-64.org/pub/linux/debug/stackcheck-1

-Andi


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 13:57       ` 4k stacks in 2.6 Andi Kleen
  2004-05-26 18:17         ` hch
@ 2004-05-26 20:39         ` Zwane Mwaikambo
  1 sibling, 0 replies; 39+ messages in thread
From: Zwane Mwaikambo @ 2004-05-26 20:39 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, andrea, linux-kernel, arjanv

On Wed, 26 May 2004, Andi Kleen wrote:

> Ingo Molnar <mingo@elte.hu> writes:
> >
> > do you realize that the 4K stacks feature also adds a separate softirq
> > and a separate hardirq stack? So the maximum footprint is 4K+4K+4K, with
>
> A nice combination would be 8K process stacks with separate irq stacks on
> i386.
>
> Any chance the CONFIGs for those two could be split?

Couldn't this just be done with a THREAD_SIZE config option?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 19:32           ` Andi Kleen
@ 2004-05-27 11:27             ` Jörn Engel
  2004-05-27 13:49               ` Andrea Arcangeli
  0 siblings, 1 reply; 39+ messages in thread
From: Jörn Engel @ 2004-05-27 11:27 UTC (permalink / raw)
  To: Andi Kleen
  Cc: David S. Miller, mingo, andrea, riel, torvalds, arjanv, linux-kernel

On Wed, 26 May 2004 21:32:32 +0200, Andi Kleen wrote:
> "David S. Miller" <davem@redhat.com> writes:
> 
> >> Change gcc to catch stack overflows before the fact and disallow
> >> module load unless modules have those checks as well.
> 
> It's impossible to do anything but panic, so it's not too helpful
> in practice.

Oh, panic is *very* helpful.  Panic won't do random funny things, it
will just stop the machine.  If we got an immediate panic on any stack
overflow, I would want 4k stacks right now.

> > That's easy, just enable profiling then implement a suitable
> > _mcount that checks for stack overflow.  I bet someone has done
> > this already.
> 
> I did it for x86-64 a long time ago. Should be easy to port to i386 
> too.
> 
> ftp://ftp.x86-64.org/pub/linux/debug/stackcheck-1

Cool!  If that is included, I don't have any objections against 4k
stacks anymore.

Jörn

-- 
The cheapest, fastest and most reliable components of a computer
system are those that aren't there.
-- Gordon Bell, DEC laboratories

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 11:27             ` Jörn Engel
@ 2004-05-27 13:49               ` Andrea Arcangeli
  2004-05-27 14:15                 ` Jörn Engel
  0 siblings, 1 reply; 39+ messages in thread
From: Andrea Arcangeli @ 2004-05-27 13:49 UTC (permalink / raw)
  To: Jörn Engel
  Cc: Andi Kleen, David S. Miller, mingo, riel, torvalds, arjanv, linux-kernel

On Thu, May 27, 2004 at 01:27:05PM +0200, Jörn Engel wrote:
> Cool!  If that is included, I don't have any objections against 4k
> stacks anymore.

note that it will introduce a huge slowdown, there's no way to enable
that in production. But for testing it's fine.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 13:49               ` Andrea Arcangeli
@ 2004-05-27 14:15                 ` Jörn Engel
  2004-05-27 14:49                   ` Andrea Arcangeli
  0 siblings, 1 reply; 39+ messages in thread
From: Jörn Engel @ 2004-05-27 14:15 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Andi Kleen, David S. Miller, mingo, riel, torvalds, arjanv, linux-kernel

On Thu, 27 May 2004 15:49:50 +0200, Andrea Arcangeli wrote:
> On Thu, May 27, 2004 at 01:27:05PM +0200, Jörn Engel wrote:
> > Cool!  If that is included, I don't have any objections against 4k
> > stacks anymore.
> 
> note that it will introduce a huge slowdown, there's no way to enable
> that in production. But for testing it's fine.

Would it be possible to add something short to the function preamble
on x86 then?  Similar to this code, maybe:

if (!(stack_pointer & 0xe00))	/* less than 512 bytes left */
	*NULL = 1;

Not sure how this can be translated into short and fast x86 assembler,
but if it is possible, I would really like to have it.  Then all we
have left to do is make sure no function ever uses more than 512
bytes.  Famous last words, I know.
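
A minimal C sketch of that check, assuming a 4K stack that is 4K-aligned
so the low 12 bits of the stack pointer give the distance to the lowest
stack address. The names stack_bytes_left and stack_nearly_full are
made up for illustration; a real version would sit in the compiler
generated prologue and would also have to account for the thread_info
that lives at the bottom of the stack.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_SIZE   4096u  /* assumed: 4K stack, aligned to 4K */
#define STACK_MARGIN  512u  /* the 512-byte threshold from the mail */

/* Distance from sp down to the lowest address of its stack page. */
static inline uint32_t stack_bytes_left(uint32_t sp)
{
    return sp & (STACK_SIZE - 1);
}

/*
 * Equivalent to !(stack_pointer & 0xe00): bits 9..11 are all clear,
 * so fewer than 512 bytes remain below sp.
 */
static inline bool stack_nearly_full(uint32_t sp)
{
    return (sp & (STACK_SIZE - 1) & ~(STACK_MARGIN - 1)) == 0;
}
```

The mask (STACK_SIZE - 1) & ~(STACK_MARGIN - 1) evaluates to 0xe00,
which is where the constant in the two-line version comes from.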

Jörn

-- 
Time? What's that? Time is only worth what you do with it.
-- Theo de Raadt

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:15                 ` Jörn Engel
@ 2004-05-27 14:49                   ` Andrea Arcangeli
  2004-05-27 14:59                     ` Jörn Engel
  0 siblings, 1 reply; 39+ messages in thread
From: Andrea Arcangeli @ 2004-05-27 14:49 UTC (permalink / raw)
  To: Jörn Engel
  Cc: Andi Kleen, David S. Miller, mingo, riel, torvalds, arjanv, linux-kernel

On Thu, May 27, 2004 at 04:15:47PM +0200, Jörn Engel wrote:
> On Thu, 27 May 2004 15:49:50 +0200, Andrea Arcangeli wrote:
> > On Thu, May 27, 2004 at 01:27:05PM +0200, Jörn Engel wrote:
> > > Cool!  If that is included, I don't have any objections against 4k
> > > stacks anymore.
> > 
> > note that it will introduce a huge slowdown, there's no way to enable
> > that in production. But for testing it's fine.
> 
> Would it be possible to add something short to the function preamble
> on x86 then?  Similar to this code, maybe:
> 
> if (!(stack_pointer & 0xe00))	/* less than 512 bytes left */
> 	*NULL = 1;
> 
> Not sure how this can be translated into short and fast x86 assembler,
> but if it is possible, I would really like to have it.  Then all we
> have left to do is make sure no function ever uses more than 512
> bytes.  Famous last words, I know.

If it would be _inlined_ it would be *much* faster, but it would likely
be measurable anyways. Less measurable though. There's no way with gcc
to inline the above in the preamble, one could hack gcc for it though
(there's exactly an asm preamble thing in gcc that is the one that is
currently implemented as call mcount plus the register saving, changing
it to the above may be feasible, though it would need a new option in
gcc)

another nice thing to have (this one zerocost at runtime) would be a
way to set a limit on the size of the local variables for each function.
gcc knows that value very well, it's the sub it does on the stack
pointer the first few asm instructions after the call.  That would
reduce the common mistakes.  An equivalent script is the one from Keith
Owens checking the vmlinux binary after compilation but I'm afraid
people run that one only after the fact.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:49                   ` Andrea Arcangeli
@ 2004-05-27 14:59                     ` Jörn Engel
  2004-05-27 15:08                       ` Keith Owens
  0 siblings, 1 reply; 39+ messages in thread
From: Jörn Engel @ 2004-05-27 14:59 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Andi Kleen, David S. Miller, mingo, riel, torvalds, arjanv, linux-kernel

On Thu, 27 May 2004 16:49:16 +0200, Andrea Arcangeli wrote:
> On Thu, May 27, 2004 at 04:15:47PM +0200, Jörn Engel wrote:
> > 
> > Would it be possible to add something short to the function preamble
> > on x86 then?  Similar to this code, maybe:
> > 
> > if (!(stack_pointer & 0xe00))	/* less than 512 bytes left */
> > 	*NULL = 1;
> > 
> > Not sure how this can be translated into short and fast x86 assembler,
> > but if it is possible, I would really like to have it.  Then all we
> > have left to do is make sure no function ever uses more than 512
> > bytes.  Famous last words, I know.
> 
> If it would be _inlined_ it would be *much* faster, but it would likely
> be measurable anyways. Less measurable though. There's no way with gcc
> to inline the above in the preamble, one could hack gcc for it though
> (there's exactly an asm preamble thing in gcc that is the one that is
> currently implemented as call mcount plus the register saving, changing
> it to the above may be feasible, though it would need a new option in
> gcc)

It is on my list, although I care more about ppc32.  Can anyone
translate the above into assembler?

> another nice thing to have (this one zerocost at runtime) would be a
> way to set a limit on the size of the local variables for each function.
> gcc knows that value very well, it's the sub it does on the stack
> pointer the first few asm instructions after the call.  That would
> reduce the common mistakes.  An equivalent script is the one from Keith
> Owens checking the vmlinux binary after compilation but I'm afraid
> people run that one only after the fact.

Plus the script is wrong sometimes.  I have had trouble with sizes
around 4G or 2G, and never found the time to really figure out what's
going on.  Might be an alloca thing that got misparsed somehow.

Having the check in gcc should cause less surprises.

Jörn

-- 
It's not whether you win or lose, it's how you place the blame.
-- unknown

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6 
  2004-05-27 14:59                     ` Jörn Engel
@ 2004-05-27 15:08                       ` Keith Owens
  2004-05-27 15:21                         ` Jörn Engel
  0 siblings, 1 reply; 39+ messages in thread
From: Keith Owens @ 2004-05-27 15:08 UTC (permalink / raw)
  To: linux-kernel

On Thu, 27 May 2004 16:59:35 +0200, 
Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:
>On Thu, 27 May 2004 16:49:16 +0200, Andrea Arcangeli wrote:
>> On Thu, May 27, 2004 at 04:15:47PM +0200, Jörn Engel wrote:
>> An equivalent script is the one from Keith
>> Owens checking the vmlinux binary after compilation but I'm afraid
>> people run that one only after the fact.
>
>Plus the script is wrong sometimes.  I have had trouble with sizes
>around 4G or 2G, and never found the time to really figure out what's
>going on.  Might be an alloca thing that got misparsed somehow.

Some code results in negative adjustments to the stack size on exit,
which look like 4G sizes.  My script checks for those and ignores them.
/^[89a-f].......$/d;
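
A small C sketch of why a negative adjustment shows up as a size near
4G, assuming the immediate is printed as eight hex digits (the exact
input format of the script is not shown in this thread, and the helper
names imm_to_hex and looks_negative are hypothetical).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Print a signed 32-bit immediate as eight lower-case hex digits. */
static void imm_to_hex(int32_t adjust, char out[9])
{
    snprintf(out, 9, "%08x", (unsigned int)(uint32_t)adjust);
}

/*
 * Same idea as /^[89a-f].......$/ : exactly eight characters whose
 * first hex digit has the top bit set, i.e. a negative 32-bit value.
 */
static bool looks_negative(const char *hex)
{
    char c = hex[0];

    return strlen(hex) == 8 &&
           ((c >= '8' && c <= '9') || (c >= 'a' && c <= 'f'));
}
```

For example, an adjustment of -100 formats as ffffff9c, which the
filter drops, while +100 formats as 00000064 and is kept.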


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 15:08                       ` Keith Owens
@ 2004-05-27 15:21                         ` Jörn Engel
  2004-05-27 15:34                           ` Arjan van de Ven
  0 siblings, 1 reply; 39+ messages in thread
From: Jörn Engel @ 2004-05-27 15:21 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

On Fri, 28 May 2004 01:08:02 +1000, Keith Owens wrote:
> On Thu, 27 May 2004 16:59:35 +0200, 
> Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:
> >
> >Plus the script is wrong sometimes.  I have had trouble with sizes
> >around 4G or 2G, and never found the time to really figure out what's
> >going on.  Might be an alloca thing that got misparsed somehow.
> 
> Some code results in negative adjustments to the stack size on exit,
> which look like 4G sizes.  My script checks for those and ignores them.
> /^[89a-f].......$/d;

Ok, looks as if only my script is wrong.  Do you know what exactly
causes such a negative adjustment?

Jörn

-- 
Optimizations always bust things, because all optimizations are, in
the long haul, a form of cheating, and cheaters eventually get caught.
-- Larry Wall 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 15:21                         ` Jörn Engel
@ 2004-05-27 15:34                           ` Arjan van de Ven
  2004-05-27 15:46                             ` Jörn Engel
  2004-06-01  5:25                             ` Jörn Engel
  0 siblings, 2 replies; 39+ messages in thread
From: Arjan van de Ven @ 2004-05-27 15:34 UTC (permalink / raw)
  To: Jörn Engel; +Cc: Keith Owens, linux-kernel

On Thu, 2004-05-27 at 17:21, Jörn Engel wrote:
> On Fri, 28 May 2004 01:08:02 +1000, Keith Owens wrote:
> > On Thu, 27 May 2004 16:59:35 +0200, 
> > Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:
> > >
> > >Plus the script is wrong sometimes.  I have had trouble with sizes
> > >around 4G or 2G, and never found the time to really figure out what's
> > >going on.  Might be an alloca thing that got misparsed somehow.
> > 
> > Some code results in negative adjustments to the stack size on exit,
> > which look like 4G sizes.  My script checks for those and ignores them.
> > /^[89a-f].......$/d;
> 
> Ok, looks as if only my script is wrong.  Do you know what exactly
> causes such a negative adjustment?

you can write "add 100,%esp" as "sub -100, %esp" :)
compilers seem to do that at times, probably some cpu model inside the
compiler decides the latter is better code in some cases  :)


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 15:34                           ` Arjan van de Ven
@ 2004-05-27 15:46                             ` Jörn Engel
  2004-06-01  5:25                             ` Jörn Engel
  1 sibling, 0 replies; 39+ messages in thread
From: Jörn Engel @ 2004-05-27 15:46 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Keith Owens, linux-kernel

On Thu, 27 May 2004 17:34:26 +0200, Arjan van de Ven wrote:
> 
> you can write "add 100,%esp" as "sub -100, %esp" :)
> compilers seem to do that at times, probably some cpu model inside the
> compiler decides the latter is better code in some cases  :)

Makes sense (in a way).  For x86 and ppc*, my script should be safe as
a nice side effect:
qr/^.*sub    \$(0x$x{3,5}),\%esp$/o

Anything above 5 digits is ignored.  That also misses allocations
above 1MB, but as long as human stupidity is finite... ;)
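
The same filter can be sketched in C with POSIX regex, assuming that
$x in the Perl pattern stands for a single hex digit class (that
definition is not shown in this mail). The function name
is_counted_stack_sub is made up for illustration.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True if the disassembly line is "sub $0x<imm>,%esp" with an
 * immediate of 3 to 5 hex digits, i.e. 0x100 up to 0xfffff.
 * Eight-digit (negative) immediates do not match and are skipped.
 */
static bool is_counted_stack_sub(const char *line)
{
    regex_t re;
    bool match;

    if (regcomp(&re, "sub +\\$0x[0-9a-f]{3,5},%esp$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    match = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return match;
}
```

Under that assumption the pattern also skips frames written with
fewer than three hex digits, so only adjustments from 256 bytes up to
just under 1MB are counted.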

Jörn

-- 
ticks = jiffies;
while (ticks == jiffies);
ticks = jiffies;
-- /usr/src/linux/init/main.c

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 15:34                           ` Arjan van de Ven
  2004-05-27 15:46                             ` Jörn Engel
@ 2004-06-01  5:25                             ` Jörn Engel
  1 sibling, 0 replies; 39+ messages in thread
From: Jörn Engel @ 2004-06-01  5:25 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Keith Owens, linux-kernel

On Thu, 27 May 2004 17:34:26 +0200, Arjan van de Ven wrote:
> 
> you can write "add 100,%esp" as "sub -100, %esp" :)
> compilers seem to do that at times, probably some cpu model inside the
> compiler decides the latter is better code in some cases  :)

That and even worse things.  sys_sendfile has a "sub $0x10,%esp"
followed by an "add $0x20,%esp".  Can you explain that one as well?
0x20 is the size of all automatic variables on i386.

I have no idea what kind of trick gcc is playing there, but it appears
to work which makes me only more curious.

Jörn

-- 
Simplicity is prerequisite for reliability.
-- Edsger W. Dijkstra

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-06-08  6:26               ` Arjan van de Ven
@ 2004-06-08  8:45                 ` Jörn Engel
  0 siblings, 0 replies; 39+ messages in thread
From: Jörn Engel @ 2004-06-08  8:45 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Timothy Miller, Ingo Molnar, Andrea Arcangeli, Rik van Riel,
	Linus Torvalds, linux-kernel

On Tue, 8 June 2004 08:26:25 +0200, Arjan van de Ven wrote:
> 
> > That gave me an idea.  Sometimes in chip design, we 'overconstrain' the 
> > logic synthesizer, because static timing analyzers often produce 
> > inaccurate results.  Anyhow, what if we were to go to 4K stacks but in 
> > static code analysis, flag anything which uses more than 2K or even 1K?

With 2.6.6, there are currently just a few non-recursive paths over
3k.  2k will give you a *lot* of output, but if you insist... ;)

http://wh.fh-wedel.de/~joern/data.nointermezzo.cs2.2k.bz2
470k compressed, 65M uncompressed

Feel free to send patches.

> the patch I sent to akpm went to 400 bytes actually, but yeah, even that
> already is debatable.

400 bytes?  That is for a single function, I assume.

Jörn

-- 
Those who come seeking peace without a treaty are plotting.
-- Sun Tzu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-06-07 18:14             ` Timothy Miller
@ 2004-06-08  6:26               ` Arjan van de Ven
  2004-06-08  8:45                 ` Jörn Engel
  0 siblings, 1 reply; 39+ messages in thread
From: Arjan van de Ven @ 2004-06-08  6:26 UTC (permalink / raw)
  To: Timothy Miller
  Cc: Jörn Engel, Ingo Molnar, Andrea Arcangeli, Rik van Riel,
	Linus Torvalds, linux-kernel


> That gave me an idea.  Sometimes in chip design, we 'overconstrain' the 
> logic synthesizer, because static timing analyzers often produce 
> inaccurate results.  Anyhow, what if we were to go to 4K stacks but in 
> static code analysis, flag anything which uses more than 2K or even 1K?

the patch I sent to akpm went to 400 bytes actually, but yeah, even that
already is debatable.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 13:00           ` Jörn Engel
  2004-05-26 13:05             ` Arjan van de Ven
@ 2004-06-07 18:14             ` Timothy Miller
  2004-06-08  6:26               ` Arjan van de Ven
  1 sibling, 1 reply; 39+ messages in thread
From: Timothy Miller @ 2004-06-07 18:14 UTC (permalink / raw)
  To: Jörn Engel
  Cc: Arjan van de Ven, Ingo Molnar, Andrea Arcangeli, Rik van Riel,
	Linus Torvalds, linux-kernel



Jörn Engel wrote:

> But I'll shut up now and see if I can generate better data over the
> weekend.  -test11 still had fun stuff like 3k stack consumption over
> some code paths in a pretty minimal kernel.  Wonder what 2.6.6 will do
> with allyesconfig. ;)

That gave me an idea.  Sometimes in chip design, we 'overconstrain' the 
logic synthesizer, because static timing analyzers often produce 
inaccurate results.  Anyhow, what if we were to go to 4K stacks but in 
static code analysis, flag anything which uses more than 2K or even 1K?


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:42                       ` Andrea Arcangeli
@ 2004-06-02 19:40                         ` Bill Davidsen
  0 siblings, 0 replies; 39+ messages in thread
From: Bill Davidsen @ 2004-06-02 19:40 UTC (permalink / raw)
  To: linux-kernel

Andrea Arcangeli wrote:
> On Thu, May 27, 2004 at 04:03:22PM +0200, Arjan van de Ven wrote:
> 
>>In theory you are absolutely right, problem is the current macro..... it's
>>SO much easier to have one stacksize everywhere (and cheaper too) for
>>this... (and it hasn't been a problem so far, esp since the softirq's have
> 
> 
> I see the problem, but then why don't we wait to implement it right, to
> allow 8k irq-stacks before merging into mainline?
> 
> grep for "~s 4k" (i.e. the word "4[kK]" in the subject) on l-k and
> you'll see there's more than just nvidia. one user reported not being
> able to boot at all with 4k stacks since 2.6.6 doesn't have a stack
> overflow in the oops, so I hope he tested w/ and w/o 4KSTACKS option
> enabled to be able to claim what broke his machine is the 4KSTACKS
> option. (his oops doesn't reveal a stack overflow, the thread_info is at
> 0xf000 and the esp is at 0xffxx)
> 
> Making it a config option, is a sort of proof that you agree it can
> break something, or you wouldn't make it a config option in the first
> place. What's the point of making it a configuration option if it cannot
> break anything and if it's not risky? Making it a config option is not
> good, because then some developer may develop leaving 4KSTACKS disabled,
> and then his kernel code might break on the users with 4KSTACKS enabled
> (it's not much different from PREEMPT).  Admittedly code that overflows
> 4k is likely to be not legitimate but not all code is good (the most
> common error is to allocate big structures locally on the stack with
> local vars), and if developers are able to notice the overflow on their
> own testing it's better.

We have lots of options which may cause problems but are useful for 
special situations, why is this one any different? The only actual 
benefit I've seen quoted for 4k stack is that it improves fork 
performance if memory is so fragmented that there is no 8k block left. 
And my first thought on hearing that was if that's common the VM should 
be investigated. This is a stable kernel, and breaking even such an 
abomination as a binary-only driver for the sake of whoever has this 
vastly fragmented memory seems to be the antithesis of stable.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 18:31                           ` Guy Sotomayor
@ 2004-05-27 19:26                             ` Brian Gerst
  0 siblings, 0 replies; 39+ messages in thread
From: Brian Gerst @ 2004-05-27 19:26 UTC (permalink / raw)
  To: Guy Sotomayor
  Cc: Linus Torvalds, Andrea Arcangeli, Ingo Molnar, Jörn Engel,
	Arjan van de Ven, Rik van Riel, linux-kernel

Guy Sotomayor wrote:

> On Thu, 2004-05-27 at 07:55, Linus Torvalds wrote:
> 
> 
>>"minor implementation detail"?
>>
>>You need to get to the thread info _some_ way, and you need to get to it
>>_fast_. There are really no sane alternatives. I certainly do not want to
>>play games with segments.
> 
> 
> While segments on x86 are in general to be avoided (aka the 286
> segmented memory models) they can be useful for some things in the
> kernel.
> 
> Here's a couple of examples:
>       * dereference gs:0 to get the thread info.  The first element in
>         the structure is its linear address (ie usable for being deref'd
>         off of DS).

The only problem with using %gs as a base register is that reloading it 
on every entry and exit is rather expensive (GDT access and privilege 
checks) compared to masking bits off %esp.  x86-64 can get away with it 
because it has the swapgs instruction which makes it efficient to use.

>       * use SS to enforce the stack limit.  This way you'd absolutely
>         get an exception when there was a stack overflow (underflow). 
>         SS gets reloaded on entry into the kernel and on interrupts
>         anyway so there really shouldn't be a performance impact.  I
>         haven't looked at all the (potential) gcc implications here so
>         this one may not be completely doable.

Not possible.  GCC completely assumes that we are working with a single 
flat address space.  It has no concept of segmentation at all.

--
				Brian Gerst

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:55                         ` Linus Torvalds
  2004-05-27 15:39                           ` Andrea Arcangeli
@ 2004-05-27 18:31                           ` Guy Sotomayor
  2004-05-27 19:26                             ` Brian Gerst
  1 sibling, 1 reply; 39+ messages in thread
From: Guy Sotomayor @ 2004-05-27 18:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrea Arcangeli, Brian Gerst, Ingo Molnar, Jörn Engel,
	Arjan van de Ven, Rik van Riel, linux-kernel

On Thu, 2004-05-27 at 07:55, Linus Torvalds wrote:

> "minor implementation detail"?
> 
> You need to get to the thread info _some_ way, and you need to get to it
> _fast_. There are really no sane alternatives. I certainly do not want to
> play games with segments.

While segments on x86 are in general to be avoided (aka the 286
segmented memory models) they can be useful for some things in the
kernel.

Here's a couple of examples:
      * dereference gs:0 to get the thread info.  The first element in
        the structure is its linear address (ie usable for being deref'd
        off of DS).
      * use SS to enforce the stack limit.  This way you'd absolutely
        get an exception when there was a stack overflow (underflow). 
        SS gets reloaded on entry into the kernel and on interrupts
        anyway so there really shouldn't be a performance impact.  I
        haven't looked at all the (potential) gcc implications here so
        this one may not be completely doable.
-- 

TTFN - Guy


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:55                         ` Linus Torvalds
@ 2004-05-27 15:39                           ` Andrea Arcangeli
  2004-05-27 18:31                           ` Guy Sotomayor
  1 sibling, 0 replies; 39+ messages in thread
From: Andrea Arcangeli @ 2004-05-27 15:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brian Gerst, Ingo Molnar, Jörn Engel, Arjan van de Ven,
	Rik van Riel, linux-kernel

On Thu, May 27, 2004 at 07:55:36AM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 27 May 2004, Andrea Arcangeli wrote:
> >
> > On Thu, May 27, 2004 at 10:18:40AM -0400, Brian Gerst wrote:
> > > The problem on i386 (unlike x86-64) is that the thread_info struct sits 
> > > at the bottom of the stack and is referenced by masking bits off %esp. 
> > > So the stack size must be constant whether in process context or IRQ 
> > > context.
> > 
> > so what, that's a minor implementation detail, pda is a software thing.
> 
> "minor implementation detail"?
> 
> You need to get to the thread info _some_ way, and you need to get to it
> _fast_. There are really no sane alternatives. I certainly do not want to
> play games with segments.

If the page is "even" the thread_info is at the top of the stack. If the
page is "odd" the thread_info is at the bottom of the stack (or the
other way around depending what you mean with "odd" and "even").

the per-cpu irq stack will have the thread_info at both the top and the
bottom of the 8k naturally aligned order1 compound page. The regular
kernel stack will have it at the top or the bottom depending if it's odd
or even.

this should allow 8k irqstack and bh stack fine at in-cpu-core speed w/o
segments or similar.

The only downside is that it adds a branch to current_thread_info that
will have to check the 12th bitflag in the esp before doing andl, the
second downside is having to update two thread_info during irq, instead
of just one.
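
A rough C sketch of the two lookups, for comparison. ti_by_mask is the
existing approach of masking bits off %esp; ti_by_parity is one
possible reading of the scheme above, with the odd 4K page keeping its
thread_info at the top and the even page at the bottom (it could be
the other way around). TI_SIZE and both function names are
hypothetical, and addresses are plain integers so this runs in user
space.

```c
#include <stdint.h>

#define PAGE_4K   0x1000u
#define BLOCK_8K  0x2000u
#define TI_SIZE   64u      /* hypothetical sizeof(struct thread_info) */

/* Current scheme: thread_info sits at the bottom of the stack. */
static inline uint32_t ti_by_mask(uint32_t esp, uint32_t thread_size)
{
    return esp & ~(thread_size - 1);
}

/* Sketched scheme: bit 12 of esp selects top or bottom of the 8K block. */
static inline uint32_t ti_by_parity(uint32_t esp)
{
    uint32_t block = esp & ~(BLOCK_8K - 1);

    if (esp & PAGE_4K)                      /* odd page: copy at the top */
        return block + BLOCK_8K - TI_SIZE;
    return block;                           /* even page: copy at the bottom */
}
```

With this layout an 8K irq stack resolves to the top copy while esp is
in the upper page and to the bottom copy once it has grown into the
lower page, which is why both copies need updating on irq entry.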

It would be probably better if the thread_info was just a pointer to a
"pda" instead of being the PDA itself so there are just two writes into
the kernel stack for every irq. In x86-64 this is much more natural
since the pda-pointer is in the cpu 64bit %gs register and that saves a
branch and dereference on the stack for every "current" invocation,
and two writes for every first-irq or first-bh. 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:50                       ` Andrea Arcangeli
@ 2004-05-27 14:55                         ` Linus Torvalds
  2004-05-27 15:39                           ` Andrea Arcangeli
  2004-05-27 18:31                           ` Guy Sotomayor
  0 siblings, 2 replies; 39+ messages in thread
From: Linus Torvalds @ 2004-05-27 14:55 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Brian Gerst, Ingo Molnar, Jörn Engel, Arjan van de Ven,
	Rik van Riel, linux-kernel



On Thu, 27 May 2004, Andrea Arcangeli wrote:
>
> On Thu, May 27, 2004 at 10:18:40AM -0400, Brian Gerst wrote:
> > The problem on i386 (unlike x86-64) is that the thread_info struct sits 
> > at the bottom of the stack and is referenced by masking bits off %esp. 
> > So the stack size must be constant whether in process context or IRQ 
> > context.
> 
> so what, that's a minor implementation detail, pda is a software thing.

"minor implementation detail"?

You need to get to the thread info _some_ way, and you need to get to it
_fast_. There are really no sane alternatives. I certainly do not want to
play games with segments.

		Linus

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:18                     ` Brian Gerst
@ 2004-05-27 14:50                       ` Andrea Arcangeli
  2004-05-27 14:55                         ` Linus Torvalds
  0 siblings, 1 reply; 39+ messages in thread
From: Andrea Arcangeli @ 2004-05-27 14:50 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Ingo Molnar, Jörn Engel, Arjan van de Ven, Rik van Riel,
	Linus Torvalds, linux-kernel

On Thu, May 27, 2004 at 10:18:40AM -0400, Brian Gerst wrote:
> The problem on i386 (unlike x86-64) is that the thread_info struct sits 
> at the bottom of the stack and is referenced by masking bits off %esp. 
> So the stack size must be constant whether in process context or IRQ 
> context.

so what, that's a minor implementation detail, pda is a software thing.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 14:03                     ` Arjan van de Ven
@ 2004-05-27 14:42                       ` Andrea Arcangeli
  2004-06-02 19:40                         ` Bill Davidsen
  0 siblings, 1 reply; 39+ messages in thread
From: Andrea Arcangeli @ 2004-05-27 14:42 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Jörn Engel, Rik van Riel, Linus Torvalds, linux-kernel

On Thu, May 27, 2004 at 04:03:22PM +0200, Arjan van de Ven wrote:
> In theory you are absolutely right, problem is the current macro..... it's
> SO much easier to have one stacksize everywhere (and cheaper too) for
> this... (and it hasn't been a problem so far, esp since the softirq's have

I see the problem, but then why don't we wait to implement it right, to
allow 8k irq-stacks before merging into mainline?

grep for "~s 4k" (i.e. the word "4[kK]" in the subject) on l-k and
you'll see there's more than just nvidia. one user reported not being
able to boot at all with 4k stacks since 2.6.6 doesn't have a stack
overflow in the oops, so I hope he tested w/ and w/o 4KSTACKS option
enabled to be able to claim what broke his machine is the 4KSTACKS
option. (his oops doesn't reveal a stack overflow, the thread_info is at
0xf000 and the esp is at 0xffxx)

Making it a config option, is a sort of proof that you agree it can
break something, or you wouldn't make it a config option in the first
place. What's the point of making it a configuration option if it cannot
break anything and if it's not risky? Making it a config option is not
good, because then some developer may develop leaving 4KSTACKS disabled,
and then his kernel code might break on the users with 4KSTACKS enabled
(it's not much different from PREEMPT).  Admittedly code that overflows
4k is likely to be not legitimate but not all code is good (the most
common error is to allocate big structures locally on the stack with
local vars), and if developers are able to notice the overflow on their
own testing it's better.
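
The "big structures on the stack" error mentioned above can be illustrated with a small userspace sketch. This is only an illustration under assumptions: BUF_SIZE and both function names are hypothetical, and malloc/free stand in for whatever allocator kernel code would actually use.

```c
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 2048  /* hypothetical scratch buffer size */

/* Risky with small stacks: one frame consumes 2 KB for a scratch buffer. */
static long sum_with_stack_buffer(const unsigned char *src, size_t len)
{
    unsigned char buf[BUF_SIZE];
    long sum = 0;

    if (len > BUF_SIZE)
        return -1;
    memcpy(buf, src, len);
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Same work, but the scratch buffer is allocated off the stack. */
static long sum_with_heap_buffer(const unsigned char *src, size_t len)
{
    unsigned char *buf;
    long sum = 0;

    if (len > BUF_SIZE)
        return -1;
    buf = malloc(BUF_SIZE);
    if (!buf)
        return -1;  /* allocation can fail, so the caller must cope */
    memcpy(buf, src, len);
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    free(buf);
    return sum;
}
```

Both functions return the same result; the second one trades a possible allocation failure for a much smaller stack frame.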

Clearly it's more relaxed to merge something knowing with a config
option you can choose if to use 4k or 8k stacks, but I'm not sure if
it's the right thing to do for the long term. If we go 4k stacks, then
I'd prefer that you drop the 4KSTACKS option and force people to reduce
the stack usage in their code, and secondly that we fixup the irqstack
to be 8k.

Plus the allocation errors you had, could be just 2.6 vm issues with
order > 0 allocations, we never had issues with 8k stacks in 2.4, so
using the 4k stacks may just hide the real problem. archs like x86-64
have to use order > 0 allocations for kernel stack, no way around it, so
order > 0 must work reliably regardless of whatever code we change in
x86.

> On x86_64 you have the PDA for current so that's not a problem, and
> you can do the bigger stacks easily but for x86 you don't...

yep.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 13:59                   ` Andrea Arcangeli
  2004-05-27 14:03                     ` Arjan van de Ven
@ 2004-05-27 14:18                     ` Brian Gerst
  2004-05-27 14:50                       ` Andrea Arcangeli
  1 sibling, 1 reply; 39+ messages in thread
From: Brian Gerst @ 2004-05-27 14:18 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Ingo Molnar, Jörn Engel, Arjan van de Ven, Rik van Riel,
	Linus Torvalds, linux-kernel

Andrea Arcangeli wrote:

> On Thu, May 27, 2004 at 02:45:51PM +0200, Ingo Molnar wrote:
> 
>>are a bit belated. I only reacted to Andrea's mail to clear up apparent
>>misunderstandings about the impact and implementation of this feature.
> 
> 
> note that there is something relevant to improve in the implementation,
> that is the per-cpu irq stack size should be bigger than 4k, we use 16k
> on x86-64, on x86 it should be 8k. Currently you're decreasing _both_
> the normal kernel context and even the irq stack in some condition.
> There's no good reason to decrease the irq stack too, that's cheap, it's
> per-cpu.

The problem on i386 (unlike x86-64) is that the thread_info struct sits 
at the bottom of the stack and is referenced by masking bits off %esp. 
So the stack size must be constant whether in process context or IRQ 
context.
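
A minimal sketch of the masking described above, written as plain C on integer addresses rather than as the kernel's actual code. The names STACK_SIZE_4K, STACK_SIZE_8K and stack_base_from_sp are hypothetical; the only assumption taken from this thread is that thread_info sits at the lowest address of a size-aligned stack block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration only. */
#define STACK_SIZE_4K 4096u
#define STACK_SIZE_8K 8192u

/*
 * Return the base address of the stack block that contains sp, assuming
 * the block is stack_size bytes long and aligned to stack_size. With the
 * layout described above, thread_info would live at this address.
 */
static uintptr_t stack_base_from_sp(uintptr_t sp, uintptr_t stack_size)
{
    /* The mask trick only works for power-of-two sizes. */
    assert(stack_size != 0 && (stack_size & (stack_size - 1)) == 0);
    return sp & ~(stack_size - 1);
}
```

For an 8K block starting at 0xc1234000, stack_base_from_sp(0xc1235f40, STACK_SIZE_8K) gives 0xc1234000, while masking the same pointer with STACK_SIZE_4K gives 0xc1235000, which is the middle of the block and not where thread_info would be. That is why one mask cannot serve two different stack sizes.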

--
				Brian Gerst

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 13:59                   ` Andrea Arcangeli
@ 2004-05-27 14:03                     ` Arjan van de Ven
  2004-05-27 14:42                       ` Andrea Arcangeli
  2004-05-27 14:18                     ` Brian Gerst
  1 sibling, 1 reply; 39+ messages in thread
From: Arjan van de Ven @ 2004-05-27 14:03 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Ingo Molnar, Jörn Engel, Rik van Riel, Linus Torvalds, linux-kernel


On Thu, May 27, 2004 at 03:59:30PM +0200, Andrea Arcangeli wrote:
> On Thu, May 27, 2004 at 02:45:51PM +0200, Ingo Molnar wrote:
> > are a bit belated. I only reacted to Andrea's mail to clear up apparent
> > misunderstandings about the impact and implementation of this feature.
> 
> note that there is something relevant to improve in the implementation,
> that is the per-cpu irq stack size should be bigger than 4k, we use 16k
> on x86-64, on x86 it should be 8k. Currently you're decreasing _both_
> the normal kernel context and even the irq stack in some condition.
> There's no good reason to decrease the irq stack too, that's cheap, it's
> per-cpu.

In theory you are absolutely right, problem is the current macro..... it's
SO much easier to have one stacksize everywhere (and cheaper too) for
this... (and it hasn't been a problem so far, esp since the softirq's have
their own stack, irq handlers seem to be all really light on the stack
already since they punt all the heavy lifting to tasklets etc.)
Tasklets don't recurse stack wise, and have their own stack, so that ought
to be fine.

On x86_64 you have the PDA for current so that's not a problem, and you can
do the bigger stacks easily but for x86 you don't...

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-27 12:45                 ` Ingo Molnar
@ 2004-05-27 13:59                   ` Andrea Arcangeli
  2004-05-27 14:03                     ` Arjan van de Ven
  2004-05-27 14:18                     ` Brian Gerst
  0 siblings, 2 replies; 39+ messages in thread
From: Andrea Arcangeli @ 2004-05-27 13:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jörn Engel, Arjan van de Ven, Rik van Riel, Linus Torvalds,
	linux-kernel

On Thu, May 27, 2004 at 02:45:51PM +0200, Ingo Molnar wrote:
> are a bit belated. I only reacted to Andrea's mail to clear up apparent
> misunderstandings about the impact and implementation of this feature.

note that there is something relevant to improve in the implementation,
that is the per-cpu irq stack size should be bigger than 4k, we use 16k
on x86-64, on x86 it should be 8k. Currently you're decreasing _both_
the normal kernel context and even the irq stack in some condition.
There's no good reason to decrease the irq stack too, that's cheap, it's
per-cpu.
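
The "cheap, it's per-cpu" argument can be checked with simple arithmetic. The sketch below is an illustration only: the function name and all parameters are hypothetical, and it assumes one stack per task plus a fixed number of irq stacks per CPU.

```c
#include <stddef.h>

/*
 * Total bytes spent on kernel stacks, assuming one stack per task and
 * irq_stacks_per_cpu interrupt stacks (e.g. hardirq + softirq) per CPU.
 */
static size_t stack_memory_bytes(size_t ntasks, size_t task_stack,
                                 size_t ncpus, size_t irq_stacks_per_cpu,
                                 size_t irq_stack)
{
    return ntasks * task_stack + ncpus * irq_stacks_per_cpu * irq_stack;
}
```

With 1000 tasks and 4 CPUs, growing both irq stacks from 4K to 8K costs 4 * 2 * 4096 = 32768 bytes in total, whereas growing the per-task stack from 4K to 8K costs 1000 * 4096 = 4096000 bytes.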

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 16:41               ` Jörn Engel
@ 2004-05-27 12:45                 ` Ingo Molnar
  2004-05-27 13:59                   ` Andrea Arcangeli
  0 siblings, 1 reply; 39+ messages in thread
From: Ingo Molnar @ 2004-05-27 12:45 UTC (permalink / raw)
  To: Jörn Engel
  Cc: Arjan van de Ven, Andrea Arcangeli, Rik van Riel, Linus Torvalds,
	linux-kernel


* Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:

> Anyway, whether we go for 4k in 2.6 or not, [...]

4K stacks have been added to the 2.6 kernel more than a month ago, are
in the official 2.6.6 kernel and are used by FC2 happily, so objections
are a bit belated. I only reacted to Andrea's mail to clear up apparent
misunderstandings about the impact and implementation of this feature.

	Ingo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 19:02           ` Matt Mackall
@ 2004-05-26 19:25             ` Dave Jones
  0 siblings, 0 replies; 39+ messages in thread
From: Dave Jones @ 2004-05-26 19:25 UTC (permalink / raw)
  To: Matt Mackall
  Cc: David S. Miller, Jörn Engel, mingo, andrea, riel, torvalds,
	arjanv, linux-kernel

On Wed, May 26, 2004 at 02:02:22PM -0500, Matt Mackall wrote:

 > There was a patch floating around for this in the 2.2 era that I
 > ported to 2.4 on one occasion. It won't tell you worst case though,
 > just worst observed case.
 > 
 > Sparse is probably not a bad place to put a real call chain stack analysis.

That won't measure any dynamic stack allocations that we're doing
at runtime, nor will it test all n combinations of drivers, which
is where most of the stack horrors have been found in recent times.

		Dave


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 18:12         ` David S. Miller
@ 2004-05-26 19:02           ` Matt Mackall
  2004-05-26 19:25             ` Dave Jones
  0 siblings, 1 reply; 39+ messages in thread
From: Matt Mackall @ 2004-05-26 19:02 UTC (permalink / raw)
  To: David S. Miller
  Cc: Jörn Engel, mingo, andrea, riel, torvalds, arjanv, linux-kernel

On Wed, May 26, 2004 at 11:12:22AM -0700, David S. Miller wrote:
> On Wed, 26 May 2004 14:50:14 +0200
> Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:
> 
> > Change gcc to catch stack overflows before the fact and disallow
> > module load unless modules have those checks as well.
> 
> That's easy, just enable profiling then implement a suitable
> _mcount that checks for stack overflow.  I bet someone has done
> this already.

There was a patch floating around for this in the 2.2 era that I
ported to 2.4 on one occasion. It won't tell you worst case though,
just worst observed case.

Sparse is probably not a bad place to put a real call chain stack analysis.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 12:50       ` Jörn Engel
  2004-05-26 12:53         ` Arjan van de Ven
@ 2004-05-26 18:12         ` David S. Miller
  2004-05-26 19:02           ` Matt Mackall
  1 sibling, 1 reply; 39+ messages in thread
From: David S. Miller @ 2004-05-26 18:12 UTC (permalink / raw)
  To: Jörn Engel; +Cc: mingo, andrea, riel, torvalds, arjanv, linux-kernel

On Wed, 26 May 2004 14:50:14 +0200
Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:

> Change gcc to catch stack overflows before the fact and disallow
> module load unless modules have those checks as well.

That's easy, just enable profiling then implement a suitable
_mcount that checks for stack overflow.  I bet someone has done
this already.

For full coverage, some trap entry handler checks in entry.S
would be necessary too of course.
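
The kind of check such a profiling hook might perform can be sketched in plain C on an integer stack pointer value. This is an illustration under assumptions, not an existing patch: STACK_SIZE, STACK_MARGIN and both function names are hypothetical, the stack is assumed to be size-aligned and to grow downward, and the space taken by thread_info at the bottom of the stack is ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SIZE   4096u  /* assumed per-context stack size */
#define STACK_MARGIN  512u  /* hypothetical warning threshold */

/* Bytes between sp and the bottom of its size-aligned stack block. */
static size_t stack_bytes_left(uintptr_t sp)
{
    return (size_t)(sp & (STACK_SIZE - 1));
}

/*
 * Return 1 when fewer than STACK_MARGIN bytes remain. A hook run on
 * every function entry could call this and complain before the stack
 * actually overflows.
 */
static int stack_nearly_full(uintptr_t sp)
{
    return stack_bytes_left(sp) < STACK_MARGIN;
}
```

As noted later in the thread, a check like this only reports the worst case that was actually observed at run time, not the worst case that is possible.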

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 13:05             ` Arjan van de Ven
@ 2004-05-26 16:41               ` Jörn Engel
  2004-05-27 12:45                 ` Ingo Molnar
  0 siblings, 1 reply; 39+ messages in thread
From: Jörn Engel @ 2004-05-26 16:41 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Andrea Arcangeli, Rik van Riel, Linus Torvalds,
	linux-kernel

On Wed, 26 May 2004 15:05:00 +0200, Arjan van de Ven wrote:
> 
> You used the word "Never" and now you go away from it.... It wasn't Never,
> and it will never be never if you want to include random binary only
modules. However in 2.4 for all intents and purposes there was 4Kb already,
> and now there still is, for user context. Because those interrupts DO
> happen. NVidia was a walking timebomb, and with one function using 4Kb
> that's an obvious Needs-Fix case. The kernel had a few of those in rare
> drivers, most of which have been fixed by now. It'll never be never, but it
> never was never either.

In a way, you are right.  nVidia was and is a walking timebomb and
making bugs more likely to happen is a good thing in general.  Except
that this bug can eat filesystems, so making it more likely will cause
more filesystems to be eaten.

Anyway, whether we go for 4k in 2.6 or not, we should do our best to
fix bad code and I will go looking for some more so others can go and
fix some more.  There's still enough horror in mainline for more than
one amusement park, we just haven't found it yet.

Jörn

-- 
All art is but imitation of nature.
-- Lucius Annaeus Seneca

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
@ 2004-05-26 15:17 Albert Cahalan
  0 siblings, 0 replies; 39+ messages in thread
From: Albert Cahalan @ 2004-05-26 15:17 UTC (permalink / raw)
  To: linux-kernel mailing list; +Cc: mingo, arjanv

Ingo Molnar writes:

> do you realize that the 4K stacks feature also adds
> a separate softirq and a separate hardirq stack?
> So the maximum footprint is 4K+4K+4K, with a clear
> and sane limit for each type of context, while the
> 2.4 kernel has 6.5K for all 3 contexts combined.
> (Also, in 2.4 irq contexts pretty much assumed that
> there's 2K of stack for them - leaving a de-facto 4K
> stack for the process and softirq contexts.) So in fact
> there is more space in 2.6 for all, and i dont really
> understand your fears.

Is that 4K per IRQ (total 64K to 1024K) or 4K total?
If it's total, then it's cheap to go with 32K.

The same goes for softirqs: 4K total, or per softirq?




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 13:00           ` Jörn Engel
@ 2004-05-26 13:05             ` Arjan van de Ven
  2004-05-26 16:41               ` Jörn Engel
  2004-06-07 18:14             ` Timothy Miller
  1 sibling, 1 reply; 39+ messages in thread
From: Arjan van de Ven @ 2004-05-26 13:05 UTC (permalink / raw)
  To: Jörn Engel
  Cc: Ingo Molnar, Andrea Arcangeli, Rik van Riel, Linus Torvalds,
	linux-kernel

On Wed, May 26, 2004 at 03:00:47PM +0200, Jörn Engel wrote:
> > > Experience indicates that for whatever reason, big stack consumers for
> > > all three contexts never hit at the same time.  Big stack consumers
> > > for one context happen too often, though.  "Too often" may be quite
> > > rare, but considering the result of a stack overflow, even "quite
> > > rare" is too much.  "Never" is the only acceptable target.
> > 

> > actually the 4k stacks approach gives MORE breathing room for the problem
> > cases that are getting hit by our customers...
> 
> For the cases you described, yes.  For some others like nvidia, no.
> Not sure if we want to make things worse for some users in order to
> improve things for others (better paying ones?).  I want the separate


You used the word "Never" and now you go away from it.... It wasn't Never,
and it will never be never if you want to include random binary only
modules. However in 2.4 for all intents and purposes there was 4Kb already,
and now there still is, for user context. Because those interrupts DO
happen. NVidia was a walking timebomb, and with one function using 4Kb
that's an obvious Needs-Fix case. The kernel had a few of those in rare
drivers, most of which have been fixed by now. It'll never be never, but it
never was never either.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 12:53         ` Arjan van de Ven
@ 2004-05-26 13:00           ` Jörn Engel
  2004-05-26 13:05             ` Arjan van de Ven
  2004-06-07 18:14             ` Timothy Miller
  0 siblings, 2 replies; 39+ messages in thread
From: Jörn Engel @ 2004-05-26 13:00 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Andrea Arcangeli, Rik van Riel, Linus Torvalds,
	linux-kernel

On Wed, 26 May 2004 14:53:00 +0200, Arjan van de Ven wrote:
> On Wed, May 26, 2004 at 02:50:14PM +0200, Jörn Engel wrote:
> > 
> > Experience indicates that for whatever reason, big stack consumers for
> > all three contexts never hit at the same time.  Big stack consumers
> > for one context happen too often, though.  "Too often" may be quite
> > rare, but considering the result of a stack overflow, even "quite
> > rare" is too much.  "Never" is the only acceptable target.
> 
> Actually it's not never in 2.4. It does get hit there by our customers once
> in a while. Esp with several NICs hitting an irq on the same CPU (e.g. the irq
> context goes over its 2Kb limit)
> 
> > done, a stack overflow will merely cause a kernel panic.  Until then,
> I am just as conservative as Andrea.
> 
> actually the 4k stacks approach gives MORE breathing room for the problem
> cases that are getting hit by our customers...

For the cases you described, yes.  For some others like nvidia, no.
Not sure if we want to make things worse for some users in order to
improve things for others (better paying ones?).  I want the separate
interrupt stacks, sure.  I'm just not comfy with 4k per process yet.

But I'll shut up now and see if I can generate better data over the
weekend.  -test11 still had fun stuff like 3k stack consumption over
some code paths in a pretty minimal kernel.  Wonder what 2.6.6 will do
with allyesconfig. ;)

Jörn

-- 
He who knows that enough is enough will always have enough.
-- Lao Tsu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 12:50       ` Jörn Engel
@ 2004-05-26 12:53         ` Arjan van de Ven
  2004-05-26 13:00           ` Jörn Engel
  2004-05-26 18:12         ` David S. Miller
  1 sibling, 1 reply; 39+ messages in thread
From: Arjan van de Ven @ 2004-05-26 12:53 UTC (permalink / raw)
  To: Jörn Engel
  Cc: Ingo Molnar, Andrea Arcangeli, Rik van Riel, Linus Torvalds,
	linux-kernel

On Wed, May 26, 2004 at 02:50:14PM +0200, Jörn Engel wrote:
> 
> Experience indicates that for whatever reason, big stack consumers for
> all three contexts never hit at the same time.  Big stack consumers
> for one context happen too often, though.  "Too often" may be quite
> rare, but considering the result of a stack overflow, even "quite
> rare" is too much.  "Never" is the only acceptable target.

Actually it's not never in 2.4. It does get hit there by our customers once
in a while. Esp with several NICs hitting an irq on the same CPU (e.g. the irq
context goes over its 2Kb limit)

> done, a stack overflow will merely cause a kernel panic.  Until then,
> I am just as conservative as Andrea.

actually the 4k stacks approach gives MORE breathing room for the problem
cases that are getting hit by our customers...

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 4k stacks in 2.6
  2004-05-26 10:33     ` 4k stacks in 2.6 Ingo Molnar
@ 2004-05-26 12:50       ` Jörn Engel
  2004-05-26 12:53         ` Arjan van de Ven
  2004-05-26 18:12         ` David S. Miller
  0 siblings, 2 replies; 39+ messages in thread
From: Jörn Engel @ 2004-05-26 12:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrea Arcangeli, Rik van Riel, Linus Torvalds, Arjan van de Ven,
	linux-kernel

On Wed, 26 May 2004 12:33:03 +0200, Ingo Molnar wrote:
> * Andrea Arcangeli <andrea@suse.de> wrote:
> > On Tue, May 25, 2004 at 04:10:29PM -0400, Rik van Riel wrote:
> > > Fragmentation causes fork trouble (gone with the 4k stacks)
> > 
> > btw, the 4k stacks sounds not safe to me, most people only tested with
> > 8k stacks so far, I wouldn't make that change in a production tree
> > > without an unstable cycle of testing in between. I'd rather risk an
> > > allocation failure than a stack memory corruption.
> 
> 4k stacks is a cool and useful feature and tons of effort went into
> making them as safe as possible. Sure, we couldn't fix up bin-only
> modules, but all the kernel drivers are audited for stack footprint, and
> many months of beta testing have gone into this as well. Anyway, if you
> prefer you can turn on 8k stacks - especially if your tree has lots of
> not-yet-upstream driver patches.
> 
> > x86-64 has per-irq stacks that allowed to reduce the stack size to 8k
> > (which is very similar to 4k for an x86, but without per-irq stack
> > it's too risky).
> 
> do you realize that the 4K stacks feature also adds a separate softirq
> and a separate hardirq stack? So the maximum footprint is 4K+4K+4K, with
> a clear and sane limit for each type of context, while the 2.4 kernel
> has 6.5K for all 3 contexts combined. (Also, in 2.4 irq contexts pretty
> much assumed that there's 2K of stack for them - leaving a de-facto 4K
> stack for the process and softirq contexts.) So in fact there is more
> space in 2.6 for all, and i dont really understand your fears.

Experience indicates that for whatever reason, big stack consumers for
all three contexts never hit at the same time.  Big stack consumers
for one context happen too often, though.  "Too often" may be quite
rare, but considering the result of a stack overflow, even "quite
rare" is too much.  "Never" is the only acceptable target.

Change gcc to catch stack overflows before the fact and disallow
module load unless modules have those checks as well.  If that is
done, a stack overflow will merely cause a kernel panic.  Until then,
I am just as conservative as Andrea.

Jörn

-- 
And spam is a useful source of entropy for /dev/random too!
-- Jasmine Strong

^ permalink raw reply	[flat|nested] 39+ messages in thread

* 4k stacks in 2.6
  2004-05-25 21:15   ` Andrea Arcangeli
@ 2004-05-26 10:33     ` Ingo Molnar
  2004-05-26 12:50       ` Jörn Engel
  0 siblings, 1 reply; 39+ messages in thread
From: Ingo Molnar @ 2004-05-26 10:33 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Rik van Riel, Linus Torvalds, Arjan van de Ven, linux-kernel


* Andrea Arcangeli <andrea@suse.de> wrote:

> On Tue, May 25, 2004 at 04:10:29PM -0400, Rik van Riel wrote:
> > Fragmentation causes fork trouble (gone with the 4k stacks)
> 
> btw, the 4k stacks sounds not safe to me, most people only tested with
> 8k stacks so far, I wouldn't make that change in a production tree
> without an unstable cycle of testing in between. I'd rather risk an
> allocation failure than a stack memory corruption.

4k stacks is a cool and useful feature and tons of effort went into
making them as safe as possible. Sure, we couldn't fix up bin-only
modules, but all the kernel drivers are audited for stack footprint, and
many months of beta testing have gone into this as well. Anyway, if you
prefer you can turn on 8k stacks - especially if your tree has lots of
not-yet-upstream driver patches.

> x86-64 has per-irq stacks that allowed to reduce the stack size to 8k
> (which is very similar to 4k for an x86, but without per-irq stack
> it's too risky).

do you realize that the 4K stacks feature also adds a separate softirq
and a separate hardirq stack? So the maximum footprint is 4K+4K+4K, with
a clear and sane limit for each type of context, while the 2.4 kernel
has 6.5K for all 3 contexts combined. (Also, in 2.4 irq contexts pretty
much assumed that there's 2K of stack for them - leaving a de-facto 4K
stack for the process and softirq contexts.) So in fact there is more
space in 2.6 for all, and i dont really understand your fears.

	Ingo

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2004-06-08  8:45 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1ZQpn-1Rx-1@gated-at.bofh.it>
     [not found] ` <1ZQz8-1Yh-15@gated-at.bofh.it>
     [not found]   ` <1ZRFf-2Vt-3@gated-at.bofh.it>
     [not found]     ` <203Zu-4aT-15@gated-at.bofh.it>
2004-05-26 13:57       ` 4k stacks in 2.6 Andi Kleen
2004-05-26 18:17         ` hch
2004-05-26 18:24           ` Andi Kleen
2004-05-26 20:39         ` Zwane Mwaikambo
     [not found]       ` <206b3-5WN-33@gated-at.bofh.it>
     [not found]         ` <20baw-1Lz-15@gated-at.bofh.it>
2004-05-26 19:32           ` Andi Kleen
2004-05-27 11:27             ` Jörn Engel
2004-05-27 13:49               ` Andrea Arcangeli
2004-05-27 14:15                 ` Jörn Engel
2004-05-27 14:49                   ` Andrea Arcangeli
2004-05-27 14:59                     ` Jörn Engel
2004-05-27 15:08                       ` Keith Owens
2004-05-27 15:21                         ` Jörn Engel
2004-05-27 15:34                           ` Arjan van de Ven
2004-05-27 15:46                             ` Jörn Engel
2004-06-01  5:25                             ` Jörn Engel
2004-05-26 15:17 Albert Cahalan
  -- strict thread matches above, loose matches on Subject: below --
2004-05-25 19:50 4g/4g for 2.6.6 Rik van Riel
2004-05-25 20:10 ` Rik van Riel
2004-05-25 21:15   ` Andrea Arcangeli
2004-05-26 10:33     ` 4k stacks in 2.6 Ingo Molnar
2004-05-26 12:50       ` Jörn Engel
2004-05-26 12:53         ` Arjan van de Ven
2004-05-26 13:00           ` Jörn Engel
2004-05-26 13:05             ` Arjan van de Ven
2004-05-26 16:41               ` Jörn Engel
2004-05-27 12:45                 ` Ingo Molnar
2004-05-27 13:59                   ` Andrea Arcangeli
2004-05-27 14:03                     ` Arjan van de Ven
2004-05-27 14:42                       ` Andrea Arcangeli
2004-06-02 19:40                         ` Bill Davidsen
2004-05-27 14:18                     ` Brian Gerst
2004-05-27 14:50                       ` Andrea Arcangeli
2004-05-27 14:55                         ` Linus Torvalds
2004-05-27 15:39                           ` Andrea Arcangeli
2004-05-27 18:31                           ` Guy Sotomayor
2004-05-27 19:26                             ` Brian Gerst
2004-06-07 18:14             ` Timothy Miller
2004-06-08  6:26               ` Arjan van de Ven
2004-06-08  8:45                 ` Jörn Engel
2004-05-26 18:12         ` David S. Miller
2004-05-26 19:02           ` Matt Mackall
2004-05-26 19:25             ` Dave Jones
