LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* GFP_ATOMIC page allocation failures.
@ 2008-04-01 23:56 Dave Jones
2008-04-02 1:28 ` Nick Piggin
0 siblings, 1 reply; 35+ messages in thread
From: Dave Jones @ 2008-04-01 23:56 UTC (permalink / raw)
To: Linux Kernel
I found a few ways to cause pages and pages of spew to dmesg
of the following form..
rhythmbox: page allocation failure. order:3, mode:0x4020
Pid: 4299, comm: rhythmbox Not tainted 2.6.25-0.172.rc7.git4.fc9.x86_64 #1
Call Trace:
<IRQ> [<ffffffff810862dc>] __alloc_pages+0x3a3/0x3c3
[<ffffffff812a58df>] ? trace_hardirqs_on_thunk+0x35/0x3a
[<ffffffff8109fd94>] alloc_pages_current+0x100/0x109
[<ffffffff810a6fd5>] new_slab+0x4a/0x249
[<ffffffff810a776a>] __slab_alloc+0x251/0x4e0
[<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
[<ffffffff810a8736>] __kmalloc_node_track_caller+0x8a/0xe2
[<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
[<ffffffff8121b5db>] __alloc_skb+0x6f/0x135
[<ffffffff8121c322>] __netdev_alloc_skb+0x31/0x4f
[<ffffffff8814e5b4>] :e1000e:e1000_alloc_rx_buffers+0xb7/0x1dc
[<ffffffff8814eada>] :e1000e:e1000_clean_rx_irq+0x271/0x307
[<ffffffff8814c71a>] :e1000e:e1000_clean+0x66/0x205
[<ffffffff8121eeb8>] net_rx_action+0xd9/0x20e
[<ffffffff81038757>] __do_softirq+0x70/0xf1
[<ffffffff8100d25c>] call_softirq+0x1c/0x28
[<ffffffff8100e485>] do_softirq+0x39/0x8a
[<ffffffff81038290>] irq_exit+0x4e/0x8f
[<ffffffff8100e781>] do_IRQ+0x145/0x167
[<ffffffff8100c5e6>] ret_from_intr+0x0/0xf
<EOI> [<ffffffff812a5ed8>] ? _spin_unlock_irqrestore+0x42/0x47
[<ffffffff8102a040>] ? __wake_up+0x43/0x50
[<ffffffff81056b7f>] ? wake_futex+0x47/0x53
[<ffffffff810584cf>] ? do_futex+0x697/0xc57
[<ffffffff8102fbc4>] ? hrtick_set+0xa1/0xfc
[<ffffffff81058b84>] ? sys_futex+0xf5/0x113
[<ffffffff810133e7>] ? syscall_trace_enter+0xb5/0xb9
[<ffffffff8100c1d0>] ? tracesys+0xd5/0xda
Given that we seem to recover from these events without negative effects
(ie, no apps get oom-killed), is there any value to actually flooding
syslog with this stuff ?
iirc, SuSE have been patching out these traces for GFP_ATOMIC allocations
for a long time. Should mainline do the same ?
Dave
--
http://www.codemonkey.org.uk
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-01 23:56 GFP_ATOMIC page allocation failures Dave Jones
@ 2008-04-02 1:28 ` Nick Piggin
2008-04-02 1:35 ` Dave Jones
0 siblings, 1 reply; 35+ messages in thread
From: Nick Piggin @ 2008-04-02 1:28 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux Kernel
On Wednesday 02 April 2008 10:56, Dave Jones wrote:
> I found a few ways to cause pages and pages of spew to dmesg
> of the following form..
>
> rhythmbox: page allocation failure. order:3, mode:0x4020
> Pid: 4299, comm: rhythmbox Not tainted 2.6.25-0.172.rc7.git4.fc9.x86_64 #1
>
> Call Trace:
> <IRQ> [<ffffffff810862dc>] __alloc_pages+0x3a3/0x3c3
> [<ffffffff812a58df>] ? trace_hardirqs_on_thunk+0x35/0x3a
> [<ffffffff8109fd94>] alloc_pages_current+0x100/0x109
> [<ffffffff810a6fd5>] new_slab+0x4a/0x249
> [<ffffffff810a776a>] __slab_alloc+0x251/0x4e0
> [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> [<ffffffff810a8736>] __kmalloc_node_track_caller+0x8a/0xe2
> [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> [<ffffffff8121b5db>] __alloc_skb+0x6f/0x135
> [<ffffffff8121c322>] __netdev_alloc_skb+0x31/0x4f
> [<ffffffff8814e5b4>] :e1000e:e1000_alloc_rx_buffers+0xb7/0x1dc
> [<ffffffff8814eada>] :e1000e:e1000_clean_rx_irq+0x271/0x307
> [<ffffffff8814c71a>] :e1000e:e1000_clean+0x66/0x205
> [<ffffffff8121eeb8>] net_rx_action+0xd9/0x20e
> [<ffffffff81038757>] __do_softirq+0x70/0xf1
> [<ffffffff8100d25c>] call_softirq+0x1c/0x28
> [<ffffffff8100e485>] do_softirq+0x39/0x8a
> [<ffffffff81038290>] irq_exit+0x4e/0x8f
> [<ffffffff8100e781>] do_IRQ+0x145/0x167
> [<ffffffff8100c5e6>] ret_from_intr+0x0/0xf
> <EOI> [<ffffffff812a5ed8>] ? _spin_unlock_irqrestore+0x42/0x47
> [<ffffffff8102a040>] ? __wake_up+0x43/0x50
> [<ffffffff81056b7f>] ? wake_futex+0x47/0x53
> [<ffffffff810584cf>] ? do_futex+0x697/0xc57
> [<ffffffff8102fbc4>] ? hrtick_set+0xa1/0xfc
> [<ffffffff81058b84>] ? sys_futex+0xf5/0x113
> [<ffffffff810133e7>] ? syscall_trace_enter+0xb5/0xb9
> [<ffffffff8100c1d0>] ? tracesys+0xd5/0xda
>
>
> Given that we seem to recover from these events without negative effects
> (ie, no apps get oom-killed), is there any value to actually flooding
> syslog with this stuff ?
It's nice to have. Perhaps it could just be hardlimited to print
say 10 times, and maybe we could have a vmstat counter to keep
count after that.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 1:28 ` Nick Piggin
@ 2008-04-02 1:35 ` Dave Jones
2008-04-02 6:28 ` Chris Snook
0 siblings, 1 reply; 35+ messages in thread
From: Dave Jones @ 2008-04-02 1:35 UTC (permalink / raw)
To: Nick Piggin; +Cc: Linux Kernel
On Wed, Apr 02, 2008 at 12:28:16PM +1100, Nick Piggin wrote:
> On Wednesday 02 April 2008 10:56, Dave Jones wrote:
> > I found a few ways to cause pages and pages of spew to dmesg
> > of the following form..
> >
> > rhythmbox: page allocation failure. order:3, mode:0x4020
> > Pid: 4299, comm: rhythmbox Not tainted 2.6.25-0.172.rc7.git4.fc9.x86_64 #1
> >
> > Call Trace:
> > <IRQ> [<ffffffff810862dc>] __alloc_pages+0x3a3/0x3c3
> > [<ffffffff812a58df>] ? trace_hardirqs_on_thunk+0x35/0x3a
> > [<ffffffff8109fd94>] alloc_pages_current+0x100/0x109
> > [<ffffffff810a6fd5>] new_slab+0x4a/0x249
> > [<ffffffff810a776a>] __slab_alloc+0x251/0x4e0
> > [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> > [<ffffffff810a8736>] __kmalloc_node_track_caller+0x8a/0xe2
> > [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> > [<ffffffff8121b5db>] __alloc_skb+0x6f/0x135
> > [<ffffffff8121c322>] __netdev_alloc_skb+0x31/0x4f
> > [<ffffffff8814e5b4>] :e1000e:e1000_alloc_rx_buffers+0xb7/0x1dc
> > [<ffffffff8814eada>] :e1000e:e1000_clean_rx_irq+0x271/0x307
> > [<ffffffff8814c71a>] :e1000e:e1000_clean+0x66/0x205
> > [<ffffffff8121eeb8>] net_rx_action+0xd9/0x20e
> > [<ffffffff81038757>] __do_softirq+0x70/0xf1
> > [<ffffffff8100d25c>] call_softirq+0x1c/0x28
> > [<ffffffff8100e485>] do_softirq+0x39/0x8a
> > [<ffffffff81038290>] irq_exit+0x4e/0x8f
> > [<ffffffff8100e781>] do_IRQ+0x145/0x167
> > [<ffffffff8100c5e6>] ret_from_intr+0x0/0xf
> > <EOI> [<ffffffff812a5ed8>] ? _spin_unlock_irqrestore+0x42/0x47
> > [<ffffffff8102a040>] ? __wake_up+0x43/0x50
> > [<ffffffff81056b7f>] ? wake_futex+0x47/0x53
> > [<ffffffff810584cf>] ? do_futex+0x697/0xc57
> > [<ffffffff8102fbc4>] ? hrtick_set+0xa1/0xfc
> > [<ffffffff81058b84>] ? sys_futex+0xf5/0x113
> > [<ffffffff810133e7>] ? syscall_trace_enter+0xb5/0xb9
> > [<ffffffff8100c1d0>] ? tracesys+0xd5/0xda
> >
> > Given that we seem to recover from these events without negative effects
> > (ie, no apps get oom-killed), is there any value to actually flooding
> > syslog with this stuff ?
>
> It's nice to have. Perhaps it could just be hardlimited to print
> say 10 times, and maybe we could have a vmstat counter to keep
> count after that.
As an end-user, that's still 10 times too many.
What is anyone expect to do with these traces ?
multi-page atomic allocations fail sometimes, we shouldn't be
surprised by this. As long as the code that tries to do them
is aware of this, is there a problem ?
Dave
--
http://www.codemonkey.org.uk
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 1:35 ` Dave Jones
@ 2008-04-02 6:28 ` Chris Snook
2008-04-02 7:56 ` Andrew Morton
0 siblings, 1 reply; 35+ messages in thread
From: Chris Snook @ 2008-04-02 6:28 UTC (permalink / raw)
To: Dave Jones, Nick Piggin, Linux Kernel
Dave Jones wrote:
> On Wed, Apr 02, 2008 at 12:28:16PM +1100, Nick Piggin wrote:
> > On Wednesday 02 April 2008 10:56, Dave Jones wrote:
> > > I found a few ways to cause pages and pages of spew to dmesg
> > > of the following form..
> > >
> > > rhythmbox: page allocation failure. order:3, mode:0x4020
> > > Pid: 4299, comm: rhythmbox Not tainted 2.6.25-0.172.rc7.git4.fc9.x86_64 #1
> > >
> > > Call Trace:
> > > <IRQ> [<ffffffff810862dc>] __alloc_pages+0x3a3/0x3c3
> > > [<ffffffff812a58df>] ? trace_hardirqs_on_thunk+0x35/0x3a
> > > [<ffffffff8109fd94>] alloc_pages_current+0x100/0x109
> > > [<ffffffff810a6fd5>] new_slab+0x4a/0x249
> > > [<ffffffff810a776a>] __slab_alloc+0x251/0x4e0
> > > [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> > > [<ffffffff810a8736>] __kmalloc_node_track_caller+0x8a/0xe2
> > > [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> > > [<ffffffff8121b5db>] __alloc_skb+0x6f/0x135
> > > [<ffffffff8121c322>] __netdev_alloc_skb+0x31/0x4f
> > > [<ffffffff8814e5b4>] :e1000e:e1000_alloc_rx_buffers+0xb7/0x1dc
> > > [<ffffffff8814eada>] :e1000e:e1000_clean_rx_irq+0x271/0x307
> > > [<ffffffff8814c71a>] :e1000e:e1000_clean+0x66/0x205
> > > [<ffffffff8121eeb8>] net_rx_action+0xd9/0x20e
> > > [<ffffffff81038757>] __do_softirq+0x70/0xf1
> > > [<ffffffff8100d25c>] call_softirq+0x1c/0x28
> > > [<ffffffff8100e485>] do_softirq+0x39/0x8a
> > > [<ffffffff81038290>] irq_exit+0x4e/0x8f
> > > [<ffffffff8100e781>] do_IRQ+0x145/0x167
> > > [<ffffffff8100c5e6>] ret_from_intr+0x0/0xf
> > > <EOI> [<ffffffff812a5ed8>] ? _spin_unlock_irqrestore+0x42/0x47
> > > [<ffffffff8102a040>] ? __wake_up+0x43/0x50
> > > [<ffffffff81056b7f>] ? wake_futex+0x47/0x53
> > > [<ffffffff810584cf>] ? do_futex+0x697/0xc57
> > > [<ffffffff8102fbc4>] ? hrtick_set+0xa1/0xfc
> > > [<ffffffff81058b84>] ? sys_futex+0xf5/0x113
> > > [<ffffffff810133e7>] ? syscall_trace_enter+0xb5/0xb9
> > > [<ffffffff8100c1d0>] ? tracesys+0xd5/0xda
> > >
> > > Given that we seem to recover from these events without negative effects
> > > (ie, no apps get oom-killed), is there any value to actually flooding
> > > syslog with this stuff ?
> >
> > It's nice to have. Perhaps it could just be hardlimited to print
> > say 10 times, and maybe we could have a vmstat counter to keep
> > count after that.
>
> As an end-user, that's still 10 times too many.
> What is anyone expect to do with these traces ?
>
> multi-page atomic allocations fail sometimes, we shouldn't be
> surprised by this. As long as the code that tries to do them
> is aware of this, is there a problem ?
>
> Dave
>
I agree that this spew is quite excessive, but it's there for a reason.
Some code does *not* handle this failure gracefully, and may put the
machine in a state where it is subsequently unable to report/log errors
from the calling code. If that happens, I'd like to see some sort of
dying gasp.
Limiting this to once per boot should suffice for debugging purposes.
Even if you manage to concoct a bug that always survives the first
failure, you should be able to take the hint when you keep seeing this
in dmesg.
-- Chris
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 6:28 ` Chris Snook
@ 2008-04-02 7:56 ` Andrew Morton
2008-04-02 8:17 ` KOSAKI Motohiro
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: Andrew Morton @ 2008-04-02 7:56 UTC (permalink / raw)
To: Chris Snook; +Cc: Dave Jones, Nick Piggin, Linux Kernel
On Wed, 02 Apr 2008 02:28:25 -0400 Chris Snook <csnook@redhat.com> wrote:
> Dave Jones wrote:
> > On Wed, Apr 02, 2008 at 12:28:16PM +1100, Nick Piggin wrote:
> > > On Wednesday 02 April 2008 10:56, Dave Jones wrote:
> > > > I found a few ways to cause pages and pages of spew to dmesg
> > > > of the following form..
> > > >
> > > > rhythmbox: page allocation failure. order:3, mode:0x4020
> > > > Pid: 4299, comm: rhythmbox Not tainted 2.6.25-0.172.rc7.git4.fc9.x86_64 #1
> > > >
> > > > Call Trace:
> > > > <IRQ> [<ffffffff810862dc>] __alloc_pages+0x3a3/0x3c3
> > > > [<ffffffff812a58df>] ? trace_hardirqs_on_thunk+0x35/0x3a
> > > > [<ffffffff8109fd94>] alloc_pages_current+0x100/0x109
> > > > [<ffffffff810a6fd5>] new_slab+0x4a/0x249
> > > > [<ffffffff810a776a>] __slab_alloc+0x251/0x4e0
> > > > [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> > > > [<ffffffff810a8736>] __kmalloc_node_track_caller+0x8a/0xe2
> > > > [<ffffffff8121c322>] ? __netdev_alloc_skb+0x31/0x4f
> > > > [<ffffffff8121b5db>] __alloc_skb+0x6f/0x135
> > > > [<ffffffff8121c322>] __netdev_alloc_skb+0x31/0x4f
> > > > [<ffffffff8814e5b4>] :e1000e:e1000_alloc_rx_buffers+0xb7/0x1dc
> > > > [<ffffffff8814eada>] :e1000e:e1000_clean_rx_irq+0x271/0x307
> > > > [<ffffffff8814c71a>] :e1000e:e1000_clean+0x66/0x205
> > > > [<ffffffff8121eeb8>] net_rx_action+0xd9/0x20e
> > > > [<ffffffff81038757>] __do_softirq+0x70/0xf1
> > > > [<ffffffff8100d25c>] call_softirq+0x1c/0x28
> > > > [<ffffffff8100e485>] do_softirq+0x39/0x8a
> > > > [<ffffffff81038290>] irq_exit+0x4e/0x8f
> > > > [<ffffffff8100e781>] do_IRQ+0x145/0x167
> > > > [<ffffffff8100c5e6>] ret_from_intr+0x0/0xf
> > > > <EOI> [<ffffffff812a5ed8>] ? _spin_unlock_irqrestore+0x42/0x47
> > > > [<ffffffff8102a040>] ? __wake_up+0x43/0x50
> > > > [<ffffffff81056b7f>] ? wake_futex+0x47/0x53
> > > > [<ffffffff810584cf>] ? do_futex+0x697/0xc57
> > > > [<ffffffff8102fbc4>] ? hrtick_set+0xa1/0xfc
> > > > [<ffffffff81058b84>] ? sys_futex+0xf5/0x113
> > > > [<ffffffff810133e7>] ? syscall_trace_enter+0xb5/0xb9
> > > > [<ffffffff8100c1d0>] ? tracesys+0xd5/0xda
> > > >
> > > > Given that we seem to recover from these events without negative effects
> > > > (ie, no apps get oom-killed), is there any value to actually flooding
> > > > syslog with this stuff ?
> > >
> > > It's nice to have. Perhaps it could just be hardlimited to print
> > > say 10 times, and maybe we could have a vmstat counter to keep
> > > count after that.
> >
> > As an end-user, that's still 10 times too many.
> > What is anyone expect to do with these traces ?
> >
> > multi-page atomic allocations fail sometimes, we shouldn't be
> > surprised by this. As long as the code that tries to do them
> > is aware of this, is there a problem ?
> >
> > Dave
> >
>
> I agree that this spew is quite excessive, but it's there for a reason.
> Some code does *not* handle this failure gracefully, and may put the
> machine in a state where it is subsequently unable to report/log errors
> from the calling code. If that happens, I'd like to see some sort of
> dying gasp.
>
> Limiting this to once per boot should suffice for debugging purposes.
> Even if you manage to concoct a bug that always survives the first
> failure, you should be able to take the hint when you keep seeing this
> in dmesg.
The appropriate thing to do here is to convert known-good drivers (such as
e1000[e]) to use __GFP_NOWARN.
Unfortunately netdev_alloc_skb() went and assumed GFP_ATOMIC, but I guess
we can dive below the covers and use __netdev_alloc_skb():
From: Andrew Morton <akpm@linux-foundation.org>
We get rather a lot of reports of page allocation warnings coming out of
e1000. But this driver is know to handle them properly so let's suppress
them.
Cc: Auke Kok <auke-jan.h.kok@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/net/e1000/e1000.h | 4 ++++
drivers/net/e1000/e1000_main.c | 10 +++++-----
2 files changed, 9 insertions(+), 5 deletions(-)
diff -puN drivers/net/e1000/e1000_main.c~e1000-suppress-page-allocation-failure-warnings drivers/net/e1000/e1000_main.c
--- a/drivers/net/e1000/e1000_main.c~e1000-suppress-page-allocation-failure-warnings
+++ a/drivers/net/e1000/e1000_main.c
@@ -4296,7 +4296,7 @@ e1000_clean_rx_irq(struct e1000_adapter
* of reassembly being done in the stack */
if (length < copybreak) {
struct sk_buff *new_skb =
- netdev_alloc_skb(netdev, length + NET_IP_ALIGN);
+ e1000_alloc_skb(netdev, length + NET_IP_ALIGN);
if (new_skb) {
skb_reserve(new_skb, NET_IP_ALIGN);
skb_copy_to_linear_data_offset(new_skb,
@@ -4585,7 +4585,7 @@ e1000_alloc_rx_buffers(struct e1000_adap
goto map_skb;
}
- skb = netdev_alloc_skb(netdev, bufsz);
+ skb = e1000_alloc_skb(netdev, bufsz);
if (unlikely(!skb)) {
/* Better luck next round */
adapter->alloc_rx_buff_failed++;
@@ -4598,7 +4598,7 @@ e1000_alloc_rx_buffers(struct e1000_adap
DPRINTK(RX_ERR, ERR, "skb align check failed: %u bytes "
"at %p\n", bufsz, skb->data);
/* Try again, without freeing the previous */
- skb = netdev_alloc_skb(netdev, bufsz);
+ skb = e1000_alloc_skb(netdev, bufsz);
/* Failed allocation, critical failure */
if (!skb) {
dev_kfree_skb(oldskb);
@@ -4720,8 +4720,8 @@ e1000_alloc_rx_buffers_ps(struct e1000_a
rx_desc->read.buffer_addr[j+1] = ~cpu_to_le64(0);
}
- skb = netdev_alloc_skb(netdev,
- adapter->rx_ps_bsize0 + NET_IP_ALIGN);
+ skb = e1000_alloc_skb(netdev,
+ adapter->rx_ps_bsize0 + NET_IP_ALIGN);
if (unlikely(!skb)) {
adapter->alloc_rx_buff_failed++;
diff -puN drivers/net/e1000/e1000.h~e1000-suppress-page-allocation-failure-warnings drivers/net/e1000/e1000.h
--- a/drivers/net/e1000/e1000.h~e1000-suppress-page-allocation-failure-warnings
+++ a/drivers/net/e1000/e1000.h
@@ -358,5 +358,9 @@ extern void e1000_power_up_phy(struct e1
extern void e1000_set_ethtool_ops(struct net_device *netdev);
extern void e1000_check_options(struct e1000_adapter *adapter);
+static inline void *e1000_alloc_skb(struct net_device *dev, unsigned int length)
+{
+ return __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN);
+}
#endif /* _E1000_H_ */
_
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 7:56 ` Andrew Morton
@ 2008-04-02 8:17 ` KOSAKI Motohiro
2008-04-02 8:24 ` David Miller
2008-04-02 9:12 ` Nick Piggin
2008-04-02 17:21 ` Jeff Garzik
2 siblings, 1 reply; 35+ messages in thread
From: KOSAKI Motohiro @ 2008-04-02 8:17 UTC (permalink / raw)
To: Andrew Morton
Cc: kosaki.motohiro, Chris Snook, Dave Jones, Nick Piggin,
Linux Kernel, netdev
Hi
CC'ed netdev
> The appropriate thing to do here is to convert known-good drivers (such as
> e1000[e]) to use __GFP_NOWARN.
>
> Unfortunately netdev_alloc_skb() went and assumed GFP_ATOMIC, but I guess
> we can dive below the covers and use __netdev_alloc_skb():
if network guys hope known-good driver should call
__netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN) instead netdev_alloc_skb,
I think we should make netdev_alloc_skb_nowarn.
static inline void *netdev_alloc_skb_nowarn(struct net_device *dev, unsigned int length)
{
return __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN);
}
if not, we make unnecessary various alloc_skb
(e.g. e1000_alloc_skb, tg3_alloc_skb, 3com_alloc_skb etc..)
> From: Andrew Morton <akpm@linux-foundation.org>
>
> We get rather a lot of reports of page allocation warnings coming out of
> e1000. But this driver is know to handle them properly so let's suppress
> them.
>
> Cc: Auke Kok <auke-jan.h.kok@intel.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Jeff Garzik <jeff@garzik.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> drivers/net/e1000/e1000.h | 4 ++++
> drivers/net/e1000/e1000_main.c | 10 +++++-----
> 2 files changed, 9 insertions(+), 5 deletions(-)
>
> diff -puN drivers/net/e1000/e1000_main.c~e1000-suppress-page-allocation-failure-warnings drivers/net/e1000/e1000_main.c
> --- a/drivers/net/e1000/e1000_main.c~e1000-suppress-page-allocation-failure-warnings
> +++ a/drivers/net/e1000/e1000_main.c
> @@ -4296,7 +4296,7 @@ e1000_clean_rx_irq(struct e1000_adapter
> * of reassembly being done in the stack */
> if (length < copybreak) {
> struct sk_buff *new_skb =
> - netdev_alloc_skb(netdev, length + NET_IP_ALIGN);
> + e1000_alloc_skb(netdev, length + NET_IP_ALIGN);
> if (new_skb) {
> skb_reserve(new_skb, NET_IP_ALIGN);
> skb_copy_to_linear_data_offset(new_skb,
> @@ -4585,7 +4585,7 @@ e1000_alloc_rx_buffers(struct e1000_adap
> goto map_skb;
> }
>
> - skb = netdev_alloc_skb(netdev, bufsz);
> + skb = e1000_alloc_skb(netdev, bufsz);
> if (unlikely(!skb)) {
> /* Better luck next round */
> adapter->alloc_rx_buff_failed++;
> @@ -4598,7 +4598,7 @@ e1000_alloc_rx_buffers(struct e1000_adap
> DPRINTK(RX_ERR, ERR, "skb align check failed: %u bytes "
> "at %p\n", bufsz, skb->data);
> /* Try again, without freeing the previous */
> - skb = netdev_alloc_skb(netdev, bufsz);
> + skb = e1000_alloc_skb(netdev, bufsz);
> /* Failed allocation, critical failure */
> if (!skb) {
> dev_kfree_skb(oldskb);
> @@ -4720,8 +4720,8 @@ e1000_alloc_rx_buffers_ps(struct e1000_a
> rx_desc->read.buffer_addr[j+1] = ~cpu_to_le64(0);
> }
>
> - skb = netdev_alloc_skb(netdev,
> - adapter->rx_ps_bsize0 + NET_IP_ALIGN);
> + skb = e1000_alloc_skb(netdev,
> + adapter->rx_ps_bsize0 + NET_IP_ALIGN);
>
> if (unlikely(!skb)) {
> adapter->alloc_rx_buff_failed++;
> diff -puN drivers/net/e1000/e1000.h~e1000-suppress-page-allocation-failure-warnings drivers/net/e1000/e1000.h
> --- a/drivers/net/e1000/e1000.h~e1000-suppress-page-allocation-failure-warnings
> +++ a/drivers/net/e1000/e1000.h
> @@ -358,5 +358,9 @@ extern void e1000_power_up_phy(struct e1
> extern void e1000_set_ethtool_ops(struct net_device *netdev);
> extern void e1000_check_options(struct e1000_adapter *adapter);
>
> +static inline void *e1000_alloc_skb(struct net_device *dev, unsigned int length)
> +{
> + return __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN);
> +}
>
> #endif /* _E1000_H_ */
> _
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 8:17 ` KOSAKI Motohiro
@ 2008-04-02 8:24 ` David Miller
2008-04-02 8:43 ` Andrew Morton
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: David Miller @ 2008-04-02 8:24 UTC (permalink / raw)
To: kosaki.motohiro; +Cc: akpm, csnook, davej, nickpiggin, linux-kernel, netdev
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 02 Apr 2008 17:17:50 +0900
> if network guys hope known-good driver should call
> __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN) instead netdev_alloc_skb,
> I think we should make netdev_alloc_skb_nowarn.
Giving it a proper name like this takes away the indication that this
situation is very special.
Two leading underscores to an interface means "something special and
unusual requiring more careful consideration than usual is occuring
here." netdev_alloc_skb_nowarn() on the other hand, does not
convey this meaning.
And we will have very few drivers that use this construct, thus it
really is best to handle things the way Andrew has.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 8:24 ` David Miller
@ 2008-04-02 8:43 ` Andrew Morton
2008-04-02 10:00 ` KOSAKI Motohiro
2008-04-02 11:04 ` Peter Zijlstra
2 siblings, 0 replies; 35+ messages in thread
From: Andrew Morton @ 2008-04-02 8:43 UTC (permalink / raw)
To: David Miller
Cc: kosaki.motohiro, csnook, davej, nickpiggin, linux-kernel, netdev
On Wed, 02 Apr 2008 01:24:06 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Wed, 02 Apr 2008 17:17:50 +0900
>
> > if network guys hope known-good driver should call
> > __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN) instead netdev_alloc_skb,
> > I think we should make netdev_alloc_skb_nowarn.
>
> Giving it a proper name like this takes away the indication that this
> situation is very special.
>
> Two leading underscores to an interface means "something special and
> unusual requiring more careful consideration than usual is occuring
> here." netdev_alloc_skb_nowarn() on the other hand, does not
> convey this meaning.
>
> And we will have very few drivers that use this construct, thus it
> really is best to handle things the way Andrew has.
Oh. I just went and redid thing the other way. Picked the same name too.
Whatever.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 7:56 ` Andrew Morton
2008-04-02 8:17 ` KOSAKI Motohiro
@ 2008-04-02 9:12 ` Nick Piggin
2008-04-02 15:54 ` Andrew Morton
2008-04-02 17:21 ` Jeff Garzik
2 siblings, 1 reply; 35+ messages in thread
From: Nick Piggin @ 2008-04-02 9:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Chris Snook, Dave Jones, Linux Kernel
On Wednesday 02 April 2008 18:56, Andrew Morton wrote:
> > Limiting this to once per boot should suffice for debugging purposes.
> > Even if you manage to concoct a bug that always survives the first
> > failure, you should be able to take the hint when you keep seeing this
> > in dmesg.
>
> The appropriate thing to do here is to convert known-good drivers (such as
> e1000[e]) to use __GFP_NOWARN.
>
> Unfortunately netdev_alloc_skb() went and assumed GFP_ATOMIC, but I guess
> we can dive below the covers and use __netdev_alloc_skb():
It's still actually nice to know how often it is happening even for
these known good sites because too much can indicate a problem and
that you could actually bring performance up by tuning some things.
So I think that the messages should stay, and they should print out
some header to say that it is only a warning and if not happening
too often then it is not a problem, and if it is continually
happening then please try X or Y or post a message to lkml...
>
>
>
> From: Andrew Morton <akpm@linux-foundation.org>
>
> We get rather a lot of reports of page allocation warnings coming out of
> e1000. But this driver is know to handle them properly so let's suppress
> them.
>
> Cc: Auke Kok <auke-jan.h.kok@intel.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Jeff Garzik <jeff@garzik.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> drivers/net/e1000/e1000.h | 4 ++++
> drivers/net/e1000/e1000_main.c | 10 +++++-----
> 2 files changed, 9 insertions(+), 5 deletions(-)
>
> diff -puN
> drivers/net/e1000/e1000_main.c~e1000-suppress-page-allocation-failure-warni
>ngs drivers/net/e1000/e1000_main.c ---
> a/drivers/net/e1000/e1000_main.c~e1000-suppress-page-allocation-failure-war
>nings +++ a/drivers/net/e1000/e1000_main.c
> @@ -4296,7 +4296,7 @@ e1000_clean_rx_irq(struct e1000_adapter
> * of reassembly being done in the stack */
> if (length < copybreak) {
> struct sk_buff *new_skb =
> - netdev_alloc_skb(netdev, length + NET_IP_ALIGN);
> + e1000_alloc_skb(netdev, length + NET_IP_ALIGN);
> if (new_skb) {
> skb_reserve(new_skb, NET_IP_ALIGN);
> skb_copy_to_linear_data_offset(new_skb,
> @@ -4585,7 +4585,7 @@ e1000_alloc_rx_buffers(struct e1000_adap
> goto map_skb;
> }
>
> - skb = netdev_alloc_skb(netdev, bufsz);
> + skb = e1000_alloc_skb(netdev, bufsz);
> if (unlikely(!skb)) {
> /* Better luck next round */
> adapter->alloc_rx_buff_failed++;
> @@ -4598,7 +4598,7 @@ e1000_alloc_rx_buffers(struct e1000_adap
> DPRINTK(RX_ERR, ERR, "skb align check failed: %u bytes "
> "at %p\n", bufsz, skb->data);
> /* Try again, without freeing the previous */
> - skb = netdev_alloc_skb(netdev, bufsz);
> + skb = e1000_alloc_skb(netdev, bufsz);
> /* Failed allocation, critical failure */
> if (!skb) {
> dev_kfree_skb(oldskb);
> @@ -4720,8 +4720,8 @@ e1000_alloc_rx_buffers_ps(struct e1000_a
> rx_desc->read.buffer_addr[j+1] = ~cpu_to_le64(0);
> }
>
> - skb = netdev_alloc_skb(netdev,
> - adapter->rx_ps_bsize0 + NET_IP_ALIGN);
> + skb = e1000_alloc_skb(netdev,
> + adapter->rx_ps_bsize0 + NET_IP_ALIGN);
>
> if (unlikely(!skb)) {
> adapter->alloc_rx_buff_failed++;
> diff -puN
> drivers/net/e1000/e1000.h~e1000-suppress-page-allocation-failure-warnings
> drivers/net/e1000/e1000.h ---
> a/drivers/net/e1000/e1000.h~e1000-suppress-page-allocation-failure-warnings
> +++ a/drivers/net/e1000/e1000.h
> @@ -358,5 +358,9 @@ extern void e1000_power_up_phy(struct e1
> extern void e1000_set_ethtool_ops(struct net_device *netdev);
> extern void e1000_check_options(struct e1000_adapter *adapter);
>
> +static inline void *e1000_alloc_skb(struct net_device *dev, unsigned int
> length) +{
> + return __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN);
> +}
>
> #endif /* _E1000_H_ */
> _
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 8:24 ` David Miller
2008-04-02 8:43 ` Andrew Morton
@ 2008-04-02 10:00 ` KOSAKI Motohiro
2008-04-02 10:56 ` KOSAKI Motohiro
2008-04-02 11:04 ` Peter Zijlstra
2 siblings, 1 reply; 35+ messages in thread
From: KOSAKI Motohiro @ 2008-04-02 10:00 UTC (permalink / raw)
To: David Miller
Cc: kosaki.motohiro, akpm, csnook, davej, nickpiggin, linux-kernel,
netdev, Michael Chan
Hi
> > if network guys hope known-good driver should call
> > __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN) instead netdev_alloc_skb,
> > I think we should make netdev_alloc_skb_nowarn.
>
> Giving it a proper name like this takes away the indication that this
> situation is very special.
Hmmm,
at least, tg3 driver often output the same message on my test environment.
(because I often run stress test ;)
I doubt almost driver has the same problem under heavy workload.
and I think tg3 is known-good driver too.
because its alloc failure doesn't cause any bad thing at all.
Do you favorite the following patch?
---------------------------------------------------------------------
We get rather a lot of reports of page allocation warnings coming out of
tg3. But this driver is know to handle them properly so let's suppress
them.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: David Miller <davem@davemloft.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
Index: b/drivers/net/tg3.c
===================================================================
--- a/drivers/net/tg3.c 2008-03-14 21:38:12.000000000 +0900
+++ b/drivers/net/tg3.c 2008-04-02 18:54:15.000000000 +0900
@@ -307,6 +307,11 @@ static const struct {
{ "interrupt test (offline)" },
};
+static inline void *tg3_alloc_skb(struct net_device *dev, unsigned int length)
+{
+ return __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN);
+}
+
static void tg3_write32(struct tg3 *tp, u32 off, u32 val)
{
writel(val, tp->regs + off);
@@ -3437,7 +3442,7 @@ static int tg3_alloc_rx_skb(struct tg3 *
* Callers depend upon this behavior and assume that
* we leave everything unchanged if we fail.
*/
- skb = netdev_alloc_skb(tp->dev, skb_size);
+ skb = tg3_alloc_skb(tp->dev, skb_size);
if (skb == NULL)
return -ENOMEM;
@@ -3609,7 +3614,7 @@ static int tg3_rx(struct tg3 *tp, int bu
tg3_recycle_rx(tp, opaque_key,
desc_idx, *post_ptr);
- copy_skb = netdev_alloc_skb(tp->dev, len + 2);
+ copy_skb = tg3_alloc_skb(tp->dev, len + 2);
if (copy_skb == NULL)
goto drop_it_no_recycle;
@@ -9370,7 +9375,7 @@ static int tg3_run_loopback(struct tg3 *
err = -EIO;
tx_len = 1514;
- skb = netdev_alloc_skb(tp->dev, tx_len);
+ skb = tg3_alloc_skb(tp->dev, tx_len);
if (!skb)
return -ENOMEM;
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 10:00 ` KOSAKI Motohiro
@ 2008-04-02 10:56 ` KOSAKI Motohiro
2008-04-02 18:44 ` David Miller
0 siblings, 1 reply; 35+ messages in thread
From: KOSAKI Motohiro @ 2008-04-02 10:56 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: kosaki.motohiro, David Miller, akpm, csnook, davej, nickpiggin,
linux-kernel, netdev, Michael Chan
> ---------------------------------------------------------------------
> We get rather a lot of reports of page allocation warnings coming out of
> tg3. But this driver is know to handle them properly so let's suppress
> them.
BTW:
other person's tg3 allocation failure bug report.
http://bugzilla.kernel.org/show_bug.cgi?id=9771
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 8:24 ` David Miller
2008-04-02 8:43 ` Andrew Morton
2008-04-02 10:00 ` KOSAKI Motohiro
@ 2008-04-02 11:04 ` Peter Zijlstra
2008-04-02 18:45 ` David Miller
2 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2008-04-02 11:04 UTC (permalink / raw)
To: David Miller
Cc: kosaki.motohiro, akpm, csnook, davej, nickpiggin, linux-kernel, netdev
On Wed, 2008-04-02 at 01:24 -0700, David Miller wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Wed, 02 Apr 2008 17:17:50 +0900
>
> > if network guys hope known-good driver should call
> > __netdev_alloc_skb(dev, length, GFP_ATOMIC|__GFP_NOWARN) instead netdev_alloc_skb,
> > I think we should make netdev_alloc_skb_nowarn.
>
> Giving it a proper name like this takes away the indication that this
> situation is very special.
>
> Two leading underscores to an interface means "something special and
> unusual requiring more careful consideration than usual is occuring
> here." netdev_alloc_skb_nowarn() on the other hand, does not
> convey this meaning.
>
> And we will have very few drivers that use this construct, thus it
> really is best to handle things the way Andrew has.
Would we not hope that most net drivers can handle {,net}dev_alloc_skb()
failing? Otherwise we have some serious trouble.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 9:12 ` Nick Piggin
@ 2008-04-02 15:54 ` Andrew Morton
2008-04-03 5:22 ` Nick Piggin
0 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2008-04-02 15:54 UTC (permalink / raw)
To: Nick Piggin; +Cc: Chris Snook, Dave Jones, Linux Kernel, netdev, Peter Zijlstra
On Wed, 2 Apr 2008 20:12:58 +1100 Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> On Wednesday 02 April 2008 18:56, Andrew Morton wrote:
>
> > > Limiting this to once per boot should suffice for debugging purposes.
> > > Even if you manage to concoct a bug that always survives the first
> > > failure, you should be able to take the hint when you keep seeing this
> > > in dmesg.
> >
> > The appropriate thing to do here is to convert known-good drivers (such as
> > e1000[e]) to use __GFP_NOWARN.
> >
> > Unfortunately netdev_alloc_skb() went and assumed GFP_ATOMIC, but I guess
> > we can dive below the covers and use __netdev_alloc_skb():
>
> It's still actually nice to know how often it is happening even for
> these known good sites because too much can indicate a problem and
> that you could actually bring performance up by tuning some things.
Yes, it's useful debugging. It tells us when we mucked up the page
allocator.
It also tells us when we mucked up the net driver - I doubt if we (or at
least, I) would have discovered that e1000 does a 32k allocation for a 5k(?)
frame if this warning wasn't coming out.
> So I think that the messages should stay, and they should print out
> some header to say that it is only a warning and if not happening
> too often then it is not a problem, and if it is continually
> happening then please try X or Y or post a message to lkml...
Yes, I suppose so.
hm, tricky.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 7:56 ` Andrew Morton
2008-04-02 8:17 ` KOSAKI Motohiro
2008-04-02 9:12 ` Nick Piggin
@ 2008-04-02 17:21 ` Jeff Garzik
2008-04-02 17:33 ` Andrew Morton
2 siblings, 1 reply; 35+ messages in thread
From: Jeff Garzik @ 2008-04-02 17:21 UTC (permalink / raw)
To: Andrew Morton
Cc: Chris Snook, Dave Jones, Nick Piggin, Linux Kernel, NetDev,
David Miller, Linus Torvalds
Andrew Morton wrote:
> The appropriate thing to do here is to convert known-good drivers (such as
> e1000[e]) to use __GFP_NOWARN.
>
> Unfortunately netdev_alloc_skb() went and assumed GFP_ATOMIC, but I guess
> we can dive below the covers and use __netdev_alloc_skb():
>
>
>
> From: Andrew Morton <akpm@linux-foundation.org>
>
> We get rather a lot of reports of page allocation warnings coming out of
> e1000. But this driver is know to handle them properly so let's suppress
> them.
Do you people hear what you're saying???
I respectfully but strongly disagree with this.
We do __not__ need a whitelist (__GFP_NOWARN) of drivers that handle
allocation failures properly. That's a long list, a maintenance
nightmare, and it is punishing good behavior.
It has been true for over a decade that allocations should be checked
for NULL, and GFP_ATOMIC allocations MUST be checked for NULL.
Let's not crap all over good drivers, because a few bad apples don't
have the proper checks.
Or at the very least, this TOTALLY BOGUS spew from working drivers
should not be foisted upon users. Every time a working driver complains
about this -- as in the examples here -- the value of the warning
decreases to noise.
And the solution to noise is not _more noise_ (adding 'nowarn' to every
damn driver in the kernel).
Jeff
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 17:21 ` Jeff Garzik
@ 2008-04-02 17:33 ` Andrew Morton
2008-04-02 18:18 ` Jeff Garzik
0 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2008-04-02 17:33 UTC (permalink / raw)
To: Jeff Garzik
Cc: Chris Snook, Dave Jones, Nick Piggin, Linux Kernel, NetDev,
David Miller, Linus Torvalds
On Wed, 02 Apr 2008 13:21:44 -0400 Jeff Garzik <jeff@garzik.org> wrote:
> Andrew Morton wrote:
> > The appropriate thing to do here is to convert known-good drivers (such as
> > e1000[e]) to use __GFP_NOWARN.
> >
> > Unfortunately netdev_alloc_skb() went and assumed GFP_ATOMIC, but I guess
> > we can dive below the covers and use __netdev_alloc_skb():
> >
> >
> >
> > From: Andrew Morton <akpm@linux-foundation.org>
> >
> > We get rather a lot of reports of page allocation warnings coming out of
> > e1000. But this driver is know to handle them properly so let's suppress
> > them.
>
>
> Do you people hear what you're saying???
>
> I respectfully but strongly disagree with this.
>
> We do __not__ need a whitelist (__GFP_NOWARN) of drivers that handle
> allocation failures properly. That's a long list, a maintenance
> nightmare, and it is punishing good behavior.
>
> It has been true for over a decade that allocations should be checked
> for NULL, and GFP_ATOMIC allocations MUST be checked for NULL.
>
> Let's not crap all over good drivers, because a few bad apples don't
> have the proper checks.
>
> Or at the very least, this TOTALLY BOGUS spew from working drivers
> should not be foisted upon users. Every time a working driver complains
> about this -- as in the examples here -- the value of the warning
> decreases to noise.
>
> And the solution to noise is not _more noise_ (adding 'nowarn' to every
> damn driver in the kernel).
>
After you've read Nick's comments (which I pray you have not), and after
you've convinced us and yourself of their wrongness, you might like to
consider adding a __GFP_NOWARN to netdev_alloc_skb().
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 17:33 ` Andrew Morton
@ 2008-04-02 18:18 ` Jeff Garzik
2008-04-02 18:37 ` Kok, Auke
2008-04-03 5:57 ` Nick Piggin
0 siblings, 2 replies; 35+ messages in thread
From: Jeff Garzik @ 2008-04-02 18:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Chris Snook, Dave Jones, Nick Piggin, Linux Kernel, NetDev,
David Miller, Linus Torvalds
Andrew Morton wrote:
> After you've read Nick's comments (which I pray you have not), and after
> you've convinced us and yourself of their wrongness, you might like to
> consider adding a __GFP_NOWARN to netdev_alloc_skb().
Already done so. Adding __GFP_NOWARN to netdev_alloc_skb() is wrong
for several reasons.
It doesn't change the underlying conditions.
It doesn't fix the desire to stamp other drivers in this manner.
And most importantly, it is not even correct: the handling of the
allocation failure remains delegated to the netdev_alloc_skb() users,
which may or may not be properly handling allocation failures.
Put simply, you don't know if the caller is stupid or smart. And there
are a _lot_ of callers, do you really want to flag all of them?
Many modern net drivers are smart, and quite gracefully handle
allocation failure without skipping a beat.
But some are really dumb, and leave big holes in their DMA rings when
allocations fail.
The warnings are valid _sometimes_, but not for others. So adding
__GFP_NOWARN to netdev_alloc_skb() unconditionally makes no sense,
except as an admission that the "spew when there is memory pressure"
idea was silly.
Turning to Nick's comment,
> It's still actually nice to know how often it is happening even for
> these known good sites because too much can indicate a problem and
> that you could actually bring performance up by tuning some things.
then create a counter or acculuation buffer somewhere.
We don't need spew every time there is memory pressure of this magnitude.
IMO there are much better ways than printk(), to inform tasks, and
humans, of allocation failures.
Jeff
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 18:18 ` Jeff Garzik
@ 2008-04-02 18:37 ` Kok, Auke
2008-04-03 5:57 ` Nick Piggin
1 sibling, 0 replies; 35+ messages in thread
From: Kok, Auke @ 2008-04-02 18:37 UTC (permalink / raw)
To: Jeff Garzik
Cc: Andrew Morton, Chris Snook, Dave Jones, Nick Piggin,
Linux Kernel, NetDev, David Miller, Linus Torvalds
Jeff Garzik wrote:
> Andrew Morton wrote:
>> After you've read Nick's comments (which I pray you have not), and after
>> you've convinced us and yourself of their wrongness, you might like to
>> consider adding a __GFP_NOWARN to netdev_alloc_skb().
>
> Already done so. Adding __GFP_NOWARN to netdev_alloc_skb() is wrong
> for several reasons.
>
> It doesn't change the underlying conditions.
> It doesn't fix the desire to stamp other drivers in this manner.
>
> And most importantly, it is not even correct: the handling of the
> allocation failure remains delegated to the netdev_alloc_skb() users,
> which may or may not be properly handling allocation failures.
>
> Put simply, you don't know if the caller is stupid or smart. And there
> are a _lot_ of callers, do you really want to flag all of them?
>
> Many modern net drivers are smart, and quite gracefully handle
> allocation failure without skipping a beat.
>
> But some are really dumb, and leave big holes in their DMA rings when
> allocations fail.
>
> The warnings are valid _sometimes_, but not for others. So adding
> __GFP_NOWARN to netdev_alloc_skb() unconditionally makes no sense,
> except as an admission that the "spew when there is memory pressure"
> idea was silly.
>
>
>
> Turning to Nick's comment,
>
>> It's still actually nice to know how often it is happening even for
>> these known good sites because too much can indicate a problem and
>> that you could actually bring performance up by tuning some things.
>
> then create a counter or acculuation buffer somewhere.
>
> We don't need spew every time there is memory pressure of this magnitude.
>
> IMO there are much better ways than printk(), to inform tasks, and
> humans, of allocation failures.
FYI e1000 and family already count various levels of alloc failures resulting from
this:
alloc_rx_buff_failed - page alloc failure (might be harmless)
rx_no_buffer_count - no buffer available for HW to use (harmless, hw will retry)
rx_missed_errors - hw dropped a packet because of above failures
still I personally think the page alloc warnings are a good thing and we've had
several issues resolve quickly because of them.
shutting them up completely moves the focus to our driver which ends up being a
victim of suspicion, and we have to circle around hard to convince the user otherwise.
Auke
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 10:56 ` KOSAKI Motohiro
@ 2008-04-02 18:44 ` David Miller
2008-04-02 20:12 ` Michael Chan
0 siblings, 1 reply; 35+ messages in thread
From: David Miller @ 2008-04-02 18:44 UTC (permalink / raw)
To: kosaki.motohiro
Cc: akpm, csnook, davej, nickpiggin, linux-kernel, netdev, mchan
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 02 Apr 2008 19:56:05 +0900
>
> > ---------------------------------------------------------------------
> > We get rather a lot of reports of page allocation warnings coming out of
> > tg3. But this driver is know to handle them properly so let's suppress
> > them.
>
> BTW:
> other person's tg3 allocation failure bug report.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=9771
Yes, I know about it. It will occur for any device
which must use linear RX buffers and supports
>PAGE_SIZE MTUs.
But modern chips are going to support segmentation of
receive packets into multiple individual pages and
therefore not have this order>0 page allocation
requirement.
That is my point.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 11:04 ` Peter Zijlstra
@ 2008-04-02 18:45 ` David Miller
2008-04-02 19:06 ` Jeff Garzik
0 siblings, 1 reply; 35+ messages in thread
From: David Miller @ 2008-04-02 18:45 UTC (permalink / raw)
To: peterz
Cc: kosaki.motohiro, akpm, csnook, davej, nickpiggin, linux-kernel, netdev
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 02 Apr 2008 13:04:59 +0200
> Would we not hope that most net drivers can handle {,net}dev_alloc_skb()
> failing? Otherwise we have some serious trouble.
False presumption.
Most can but some legacy ones really do not handle this well.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 18:45 ` David Miller
@ 2008-04-02 19:06 ` Jeff Garzik
0 siblings, 0 replies; 35+ messages in thread
From: Jeff Garzik @ 2008-04-02 19:06 UTC (permalink / raw)
To: David Miller
Cc: peterz, kosaki.motohiro, akpm, csnook, davej, nickpiggin,
linux-kernel, netdev
David Miller wrote:
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Wed, 02 Apr 2008 13:04:59 +0200
>
>> Would we not hope that most net drivers can handle {,net}dev_alloc_skb()
>> failing? Otherwise we have some serious trouble.
>
> False presumption.
>
> Most can but some legacy ones really do not handle this well.
Correct...
In addition, I remain worried about potential edge cases in drivers that
do "burst refill" style allocations, rather than the [IMO safer]
as-you-go style that tg3 employs.
I haven't done a serious audit, but I have nagging doubts about the
behavior of burst-refill drivers when faced with a burst of failed
allocations... do they stop DMA operation, or do they permit hardware
to wander into part of the DMA ring where, in the previous "cycle"
through the ring, valid DMA addresses were placed. But now with a
string of allocation failures, you have a sequence of DMA ring
descriptors that point to invalid memory. If the DMA engine is not
stopped, or in some other way made to avoid those invalid DMA
descriptors, then you can easily run into problems.
Sometimes, it's as easy as making sure to properly police the 'OWN' bit
of each descriptor, something easily and commonly done as a matter of
course. But modern hardware with multi-descriptor packets make things
more complex, so it is actually getting more difficult to get these
details right.
</ramble>
Jeff
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 18:44 ` David Miller
@ 2008-04-02 20:12 ` Michael Chan
2008-04-02 20:53 ` David Miller
0 siblings, 1 reply; 35+ messages in thread
From: Michael Chan @ 2008-04-02 20:12 UTC (permalink / raw)
To: David Miller, mcarlson
Cc: kosaki.motohiro, akpm, csnook, davej, nickpiggin, linux-kernel, netdev
On Wed, 2008-04-02 at 11:44 -0700, David Miller wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Wed, 02 Apr 2008 19:56:05 +0900
>
> > BTW:
> > other person's tg3 allocation failure bug report.
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=9771
>
> Yes, I know about it. It will occur for any device
> which must use linear RX buffers and supports
> >PAGE_SIZE MTUs.
>
> But modern chips are going to support segmentation of
> receive packets into multiple individual pages and
> therefore not have this order>0 page allocation
> requirement.
tg3 hardware has a jumbo rx ring that can accommodate non-linear SKBs.
Matt is working on some patches to add this feature.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 20:12 ` Michael Chan
@ 2008-04-02 20:53 ` David Miller
0 siblings, 0 replies; 35+ messages in thread
From: David Miller @ 2008-04-02 20:53 UTC (permalink / raw)
To: mchan
Cc: mcarlson, kosaki.motohiro, akpm, csnook, davej, nickpiggin,
linux-kernel, netdev
From: "Michael Chan" <mchan@broadcom.com>
Date: Wed, 02 Apr 2008 12:12:06 -0800
> tg3 hardware has a jumbo rx ring that can accommodate non-linear SKBs.
> Matt is working on some patches to add this feature.
Excellent.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 15:54 ` Andrew Morton
@ 2008-04-03 5:22 ` Nick Piggin
2008-04-03 5:32 ` Andrew Morton
0 siblings, 1 reply; 35+ messages in thread
From: Nick Piggin @ 2008-04-03 5:22 UTC (permalink / raw)
To: Andrew Morton
Cc: Chris Snook, Dave Jones, Linux Kernel, netdev, Peter Zijlstra
On Thursday 03 April 2008 02:54, Andrew Morton wrote:
> On Wed, 2 Apr 2008 20:12:58 +1100 Nick Piggin <nickpiggin@yahoo.com.au>
wrote:
> > On Wednesday 02 April 2008 18:56, Andrew Morton wrote:
> > > > Limiting this to once per boot should suffice for debugging purposes.
> > > > Even if you manage to concoct a bug that always survives the first
> > > > failure, you should be able to take the hint when you keep seeing
> > > > this in dmesg.
> > >
> > > The appropriate thing to do here is to convert known-good drivers (such
> > > as e1000[e]) to use __GFP_NOWARN.
> > >
> > > Unfortunately netdev_alloc_skb() went and assumed GFP_ATOMIC, but I
> > > guess we can dive below the covers and use __netdev_alloc_skb():
> >
> > It's still actually nice to know how often it is happening even for
> > these known good sites because too much can indicate a problem and
> > that you could actually bring performance up by tuning some things.
>
> Yes, it's useful debugging. It tells us when we mucked up the page
> allocator.
Right.
> It also tells us when we mucked up the net driver - I doubt if we (or at
> least, I) would have discovered that e1000 does a 32k allocation for a
> 5k(?) frame if this warning wasn't coming out.
Is that right? If it is allocating for 9K MTU, then the slab allocator
(slub in this case) will bump that up to the 16K kmalloc slab. If it is
a 5K frame, then it would get the 8K kmalloc slab I think.
Oh, but SLUB's default MIN_OBJECTS is 4, so 4*8 is 32 indeed. So slub
is probably deciding to round the kmalloc-8192 allocations up to order-3.
I think. How did you know it was a 5k frame? :)
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-03 5:22 ` Nick Piggin
@ 2008-04-03 5:32 ` Andrew Morton
2008-04-03 8:59 ` Evgeniy Polyakov
0 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2008-04-03 5:32 UTC (permalink / raw)
To: Nick Piggin; +Cc: Chris Snook, Dave Jones, Linux Kernel, netdev, Peter Zijlstra
On Thu, 3 Apr 2008 16:22:26 +1100 Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > It also tells us when we mucked up the net driver - I doubt if we (or at
> > least, I) would have discovered that e1000 does a 32k allocation for a
> > 5k(?) frame if this warning wasn't coming out.
>
> Is that right? If it is allocating for 9K MTU, then the slab allocator
> (slub in this case) will bump that up to the 16K kmalloc slab. If it is
> a 5K frame, then it would get the 8K kmalloc slab I think.
>
> Oh, but SLUB's default MIN_OBJECTS is 4, so 4*8 is 32 indeed. So slub
> is probably deciding to round the kmalloc-8192 allocations up to order-3.
> I think. How did you know it was a 5k frame? :)
urgh, it was a while ago, and I don't know if e1000e retains the behaviour.
iirc the issue was with some errant versions of the hardware needing
exorbitant alignment and additional padding at the end because of
occasional DMA overruns. Something like that.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-02 18:18 ` Jeff Garzik
2008-04-02 18:37 ` Kok, Auke
@ 2008-04-03 5:57 ` Nick Piggin
2008-04-03 18:20 ` Jeff Garzik
1 sibling, 1 reply; 35+ messages in thread
From: Nick Piggin @ 2008-04-03 5:57 UTC (permalink / raw)
To: Jeff Garzik
Cc: Andrew Morton, Chris Snook, Dave Jones, Linux Kernel, NetDev,
David Miller, Linus Torvalds
On Thursday 03 April 2008 05:18, Jeff Garzik wrote:
> Turning to Nick's comment,
>
> > It's still actually nice to know how often it is happening even for
> > these known good sites because too much can indicate a problem and
> > that you could actually bring performance up by tuning some things.
>
> then create a counter or acculuation buffer somewhere.
>
> We don't need spew every time there is memory pressure of this magnitude.
Not a complete solution. Counter would be nice, but you need backtraces
and want a way to more proactively warn the user/tester/developer.
I agree that I don't exactly like adding nowarns around, and I don't think
places like driver writers should have to know about this stuff.
> IMO there are much better ways than printk(), to inform tasks, and
> humans, of allocation failures.
I think with a tweaked warning message, a ratelimited printk is OK.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-03 5:32 ` Andrew Morton
@ 2008-04-03 8:59 ` Evgeniy Polyakov
2008-06-26 21:06 ` Dave Jones
0 siblings, 1 reply; 35+ messages in thread
From: Evgeniy Polyakov @ 2008-04-03 8:59 UTC (permalink / raw)
To: Andrew Morton
Cc: Nick Piggin, Chris Snook, Dave Jones, Linux Kernel, netdev,
Peter Zijlstra
On Wed, Apr 02, 2008 at 10:32:54PM -0700, Andrew Morton (akpm@linux-foundation.org) wrote:
> > > It also tells us when we mucked up the net driver - I doubt if we (or at
> > > least, I) would have discovered that e1000 does a 32k allocation for a
> > > 5k(?) frame if this warning wasn't coming out.
> >
> > Is that right? If it is allocating for 9K MTU, then the slab allocator
> > (slub in this case) will bump that up to the 16K kmalloc slab. If it is
> > a 5K frame, then it would get the 8K kmalloc slab I think.
> >
> > Oh, but SLUB's default MIN_OBJECTS is 4, so 4*8 is 32 indeed. So slub
> > is probably deciding to round the kmalloc-8192 allocations up to order-3.
> > I think. How did you know it was a 5k frame? :)
>
> urgh, it was a while ago, and I don't know if e1000e retains the behaviour.
>
> iirc the issue was with some errant versions of the hardware needing
> exorbitant alignment and additional padding at the end because of
> occasional DMA overruns. Something like that.
e1000 hardware does require power-of-two alignment, network stack adds
additional structure at the end, so with e1000 it ends up with two
rounds to the higher power of two.
5k ends up with 16k allocations, 9k - to 32k.
This problem is known for years already and number of fixes was
proposed, but the really good one is to rewrite e1000 allocation path to
use fragments, which I believe was done in the new e1000 driver.
And as a side note: shuting allocation failures is a very bad step,
since it hides allocation problems for drivers. if people do care about
it add __GFP_SMALL_WARN flag which will just print that allocation
failed, its order and function where it happend.
--
Evgeniy Polyakov
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-03 5:57 ` Nick Piggin
@ 2008-04-03 18:20 ` Jeff Garzik
0 siblings, 0 replies; 35+ messages in thread
From: Jeff Garzik @ 2008-04-03 18:20 UTC (permalink / raw)
To: Nick Piggin
Cc: Andrew Morton, Chris Snook, Dave Jones, Linux Kernel, NetDev,
David Miller, Linus Torvalds
Nick Piggin wrote:
> On Thursday 03 April 2008 05:18, Jeff Garzik wrote:
>
>> Turning to Nick's comment,
>>
>>> It's still actually nice to know how often it is happening even for
>>> these known good sites because too much can indicate a problem and
>>> that you could actually bring performance up by tuning some things.
>> then create a counter or acculuation buffer somewhere.
>>
>> We don't need spew every time there is memory pressure of this magnitude.
>
> Not a complete solution. Counter would be nice, but you need backtraces
> and want a way to more proactively warn the user/tester/developer.
>
> I agree that I don't exactly like adding nowarns around, and I don't think
> places like driver writers should have to know about this stuff.
>
>
>> IMO there are much better ways than printk(), to inform tasks, and
>> humans, of allocation failures.
>
> I think with a tweaked warning message, a ratelimited printk is OK.
No objections here, and agreed on all points.
Though IMO adding __GFP_NOWARN to netdev_alloc_skb() falls into that
category (should not generally be in a driver or driver API).
Jeff
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-03 8:59 ` Evgeniy Polyakov
@ 2008-06-26 21:06 ` Dave Jones
2008-06-26 22:26 ` Chris Snook
2008-06-27 10:01 ` Evgeniy Polyakov
0 siblings, 2 replies; 35+ messages in thread
From: Dave Jones @ 2008-06-26 21:06 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: Andrew Morton, Nick Piggin, Chris Snook, Dave Jones,
Linux Kernel, netdev, Peter Zijlstra
This thread seemed to die out with no resolution..
On Thu, Apr 03, 2008 at 12:59:22PM +0400, Evgeniy Polyakov wrote:
> On Wed, Apr 02, 2008 at 10:32:54PM -0700, Andrew Morton (akpm@linux-foundation.org) wrote:
> > > > It also tells us when we mucked up the net driver - I doubt if we (or at
> > > > least, I) would have discovered that e1000 does a 32k allocation for a
> > > > 5k(?) frame if this warning wasn't coming out.
> > >
> > > Is that right? If it is allocating for 9K MTU, then the slab allocator
> > > (slub in this case) will bump that up to the 16K kmalloc slab. If it is
> > > a 5K frame, then it would get the 8K kmalloc slab I think.
> > >
> > > Oh, but SLUB's default MIN_OBJECTS is 4, so 4*8 is 32 indeed. So slub
> > > is probably deciding to round the kmalloc-8192 allocations up to order-3.
> > > I think. How did you know it was a 5k frame? :)
> >
> > urgh, it was a while ago, and I don't know if e1000e retains the behaviour.
> >
> > iirc the issue was with some errant versions of the hardware needing
> > exorbitant alignment and additional padding at the end because of
> > occasional DMA overruns. Something like that.
>
> e1000 hardware does require power-of-two alignment, network stack adds
> additional structure at the end, so with e1000 it ends up with two
> rounds to the higher power of two.
> 5k ends up with 16k allocations, 9k - to 32k.
>
> This problem is known for years already and number of fixes was
> proposed, but the really good one is to rewrite e1000 allocation path to
> use fragments, which I believe was done in the new e1000 driver.
So this morning, we got a fresh report from this in 2.6.25.6's e1000 driver
https://bugzilla.redhat.com/show_bug.cgi?id=453010
Pages and pages of spew, which make users freak out.
This stuff might be 'nice to know', but if it isn't getting fixed,
I can see why some distros have been shipping the 'silence GFP_ATOMIC failures'
patches for some time.
Dave
> And as a side note: shuting allocation failures is a very bad step,
> since it hides allocation problems for drivers. if people do care about
> it add __GFP_SMALL_WARN flag which will just print that allocation
> failed, its order and function where it happend.
--
http://www.codemonkey.org.uk
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-06-26 21:06 ` Dave Jones
@ 2008-06-26 22:26 ` Chris Snook
2008-06-27 10:01 ` Evgeniy Polyakov
1 sibling, 0 replies; 35+ messages in thread
From: Chris Snook @ 2008-06-26 22:26 UTC (permalink / raw)
To: Dave Jones, Evgeniy Polyakov, Andrew Morton, Nick Piggin,
Chris Snook, Dave Jones, Linux Kernel, netdev, Peter Zijlstra
Dave Jones wrote:
> This thread seemed to die out with no resolution..
>
> On Thu, Apr 03, 2008 at 12:59:22PM +0400, Evgeniy Polyakov wrote:
> > On Wed, Apr 02, 2008 at 10:32:54PM -0700, Andrew Morton (akpm@linux-foundation.org) wrote:
> > > > > It also tells us when we mucked up the net driver - I doubt if we (or at
> > > > > least, I) would have discovered that e1000 does a 32k allocation for a
> > > > > 5k(?) frame if this warning wasn't coming out.
> > > >
> > > > Is that right? If it is allocating for 9K MTU, then the slab allocator
> > > > (slub in this case) will bump that up to the 16K kmalloc slab. If it is
> > > > a 5K frame, then it would get the 8K kmalloc slab I think.
> > > >
> > > > Oh, but SLUB's default MIN_OBJECTS is 4, so 4*8 is 32 indeed. So slub
> > > > is probably deciding to round the kmalloc-8192 allocations up to order-3.
> > > > I think. How did you know it was a 5k frame? :)
> > >
> > > urgh, it was a while ago, and I don't know if e1000e retains the behaviour.
> > >
> > > iirc the issue was with some errant versions of the hardware needing
> > > exorbitant alignment and additional padding at the end because of
> > > occasional DMA overruns. Something like that.
> >
> > e1000 hardware does require power-of-two alignment, network stack adds
> > additional structure at the end, so with e1000 it ends up with two
> > rounds to the higher power of two.
> > 5k ends up with 16k allocations, 9k - to 32k.
> >
> > This problem is known for years already and number of fixes was
> > proposed, but the really good one is to rewrite e1000 allocation path to
> > use fragments, which I believe was done in the new e1000 driver.
>
> So this morning, we got a fresh report from this in 2.6.25.6's e1000 driver
> https://bugzilla.redhat.com/show_bug.cgi?id=453010
> Pages and pages of spew, which make users freak out.
> This stuff might be 'nice to know', but if it isn't getting fixed,
> I can see why some distros have been shipping the 'silence GFP_ATOMIC failures'
> patches for some time.
>
> Dave
>
> > And as a side note: shuting allocation failures is a very bad step,
> > since it hides allocation problems for drivers. if people do care about
> > it add __GFP_SMALL_WARN flag which will just print that allocation
> > failed, its order and function where it happend.
Maybe we should make __GFP_SMALL_WARN the default behavior, and allow subsystems
to silence or print the full warning if they see fit. That solves most of the
spew, while still preserving the most important information for debugging, and
still gives subsystem maintainers the power to do as they see fit.
-- Chris
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-06-26 21:06 ` Dave Jones
2008-06-26 22:26 ` Chris Snook
@ 2008-06-27 10:01 ` Evgeniy Polyakov
1 sibling, 0 replies; 35+ messages in thread
From: Evgeniy Polyakov @ 2008-06-27 10:01 UTC (permalink / raw)
To: Dave Jones, Andrew Morton, Nick Piggin, Chris Snook, Dave Jones,
Linux Kernel, netdev, Peter Zijlstra
On Thu, Jun 26, 2008 at 05:06:24PM -0400, Dave Jones (davej@redhat.com) wrote:
> This thread seemed to die out with no resolution..
Please ask to test e1000e driver instead, if it supports given PCI ids.
It has completely rewritten input path which does not suffer from this
kind of problems, so you do not need to care too much about e1000, which
will die sooner or later.
--
Evgeniy Polyakov
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-05 1:06 ` Nick Piggin
@ 2008-04-06 12:12 ` Bodo Eggert
0 siblings, 0 replies; 35+ messages in thread
From: Bodo Eggert @ 2008-04-06 12:12 UTC (permalink / raw)
To: Nick Piggin
Cc: Bodo Eggert, Jeff Garzik, Andrew Morton, Chris Snook, Dave Jones,
Linux Kernel, NetDev, David Miller, Linus Torvalds
On Sat, 5 Apr 2008, Nick Piggin wrote:
> On Friday 04 April 2008 22:35, Bodo Eggert wrote:
> > On Fri, 4 Apr 2008, Nick Piggin wrote:
> > > But actually developers do sometimes want see the event even if it is
> > > relatively infrequent...
> >
> > You shouldn't frighten the users either. /proc/sys/vm/debug?
>
> Hence changing the message a to make it clear that it is not a
> problem if it is infrequent.
If it's no problem when infrequent, and if it's possible and cheap (less
extra code than sizeof(explanation)) to not show a message if it's
infrequent, why should the users be bothered at all? The system needs less
than one ms to do the job, the admin needs five minutes to grep the logs.
--
Top 100 things you don't want the sysadmin to say:
72. My leave starts tomorrow.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-04 11:35 ` Bodo Eggert
@ 2008-04-05 1:06 ` Nick Piggin
2008-04-06 12:12 ` Bodo Eggert
0 siblings, 1 reply; 35+ messages in thread
From: Nick Piggin @ 2008-04-05 1:06 UTC (permalink / raw)
To: Bodo Eggert
Cc: Jeff Garzik, Andrew Morton, Chris Snook, Dave Jones,
Linux Kernel, NetDev, David Miller, Linus Torvalds
On Friday 04 April 2008 22:35, Bodo Eggert wrote:
> On Fri, 4 Apr 2008, Nick Piggin wrote:
> > But actually developers do sometimes want see the event even if it is
> > relatively infrequent...
>
> You shouldn't frighten the users either. /proc/sys/vm/debug?
Hence changing the message a to make it clear that it is not a
problem if it is infrequent.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-04 10:59 ` Nick Piggin
@ 2008-04-04 11:35 ` Bodo Eggert
2008-04-05 1:06 ` Nick Piggin
0 siblings, 1 reply; 35+ messages in thread
From: Bodo Eggert @ 2008-04-04 11:35 UTC (permalink / raw)
To: Nick Piggin
Cc: 7eggert, Jeff Garzik, Andrew Morton, Chris Snook, Dave Jones,
Linux Kernel, NetDev, David Miller, Linus Torvalds
On Fri, 4 Apr 2008, Nick Piggin wrote:
> On Friday 04 April 2008 20:52, Bodo Eggert wrote:
> > Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > > On Thursday 03 April 2008 05:18, Jeff Garzik wrote:
> > >> Turning to Nick's comment,
> > >>
> > >> > It's still actually nice to know how often it is happening even for
> > >> > these known good sites because too much can indicate a problem and
> > >> > that you could actually bring performance up by tuning some things.
> > >>
> > >> then create a counter or acculuation buffer somewhere.
> > >>
> > >> We don't need spew every time there is memory pressure of this
> > >> magnitude.
> > >
> > > Not a complete solution. Counter would be nice, but you need backtraces
> > > and want a way to more proactively warn the user/tester/developer.
> > >
> > > I agree that I don't exactly like adding nowarns around, and I don't
> > > think places like driver writers should have to know about this stuff.
> >
> > What about reverse ratelimiting: If the limit is reached, a backtrace will
> > be generated (and, off cause, positively ratelimited)?
>
> I was thinking about that. I got as far as writing a simple patch to
> printk so that it would not start to trigger until it gets a 2nd event
> within 'n' jiffies of the first.
I think there was a standalone ratelimit function. I'd use it like this:
static atomic_alloc_ratelimit; /* needs to be initialized ... */
{
...
if (success)
return mem;
if(!debug && ratelimit(atomic_alloc_ratelimit))
return err_ptr(-ENOMEM);
if (printk_ratelimit(first line) > 0) {
printk(rest);
}
}
> But actually developers do sometimes want see the event even if it is
> relatively infrequent...
You shouldn't frighten the users either. /proc/sys/vm/debug?
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
2008-04-04 9:52 ` Bodo Eggert
@ 2008-04-04 10:59 ` Nick Piggin
2008-04-04 11:35 ` Bodo Eggert
0 siblings, 1 reply; 35+ messages in thread
From: Nick Piggin @ 2008-04-04 10:59 UTC (permalink / raw)
To: 7eggert
Cc: Jeff Garzik, Andrew Morton, Chris Snook, Dave Jones,
Linux Kernel, NetDev, David Miller, Linus Torvalds
On Friday 04 April 2008 20:52, Bodo Eggert wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > On Thursday 03 April 2008 05:18, Jeff Garzik wrote:
> >> Turning to Nick's comment,
> >>
> >> > It's still actually nice to know how often it is happening even for
> >> > these known good sites because too much can indicate a problem and
> >> > that you could actually bring performance up by tuning some things.
> >>
> >> then create a counter or acculuation buffer somewhere.
> >>
> >> We don't need spew every time there is memory pressure of this
> >> magnitude.
> >
> > Not a complete solution. Counter would be nice, but you need backtraces
> > and want a way to more proactively warn the user/tester/developer.
> >
> > I agree that I don't exactly like adding nowarns around, and I don't
> > think places like driver writers should have to know about this stuff.
>
> What about reverse ratelimiting: If the limit is reached, a backtrace will
> be generated (and, off cause, positively ratelimited)?
I was thinking about that. I got as far as writing a simple patch to
printk so that it would not start to trigger until it gets a 2nd event
within 'n' jiffies of the first.
But actually developers do sometimes want see the event even if it is
relatively infrequent...
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: GFP_ATOMIC page allocation failures.
[not found] ` <aeqF6-45P-29@gated-at.bofh.it>
@ 2008-04-04 9:52 ` Bodo Eggert
2008-04-04 10:59 ` Nick Piggin
0 siblings, 1 reply; 35+ messages in thread
From: Bodo Eggert @ 2008-04-04 9:52 UTC (permalink / raw)
To: Nick Piggin, Jeff Garzik, Andrew Morton, Chris Snook, Dave Jones,
Linux Kernel, NetDev, David Miller, Linus Torvalds
Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> On Thursday 03 April 2008 05:18, Jeff Garzik wrote:
>> Turning to Nick's comment,
>>
>> > It's still actually nice to know how often it is happening even for
>> > these known good sites because too much can indicate a problem and
>> > that you could actually bring performance up by tuning some things.
>>
>> then create a counter or acculuation buffer somewhere.
>>
>> We don't need spew every time there is memory pressure of this magnitude.
>
> Not a complete solution. Counter would be nice, but you need backtraces
> and want a way to more proactively warn the user/tester/developer.
>
> I agree that I don't exactly like adding nowarns around, and I don't think
> places like driver writers should have to know about this stuff.
What about reverse ratelimiting: If the limit is reached, a backtrace will be
generated (and, off cause, positively ratelimited)?
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2008-06-27 10:02 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-04-01 23:56 GFP_ATOMIC page allocation failures Dave Jones
2008-04-02 1:28 ` Nick Piggin
2008-04-02 1:35 ` Dave Jones
2008-04-02 6:28 ` Chris Snook
2008-04-02 7:56 ` Andrew Morton
2008-04-02 8:17 ` KOSAKI Motohiro
2008-04-02 8:24 ` David Miller
2008-04-02 8:43 ` Andrew Morton
2008-04-02 10:00 ` KOSAKI Motohiro
2008-04-02 10:56 ` KOSAKI Motohiro
2008-04-02 18:44 ` David Miller
2008-04-02 20:12 ` Michael Chan
2008-04-02 20:53 ` David Miller
2008-04-02 11:04 ` Peter Zijlstra
2008-04-02 18:45 ` David Miller
2008-04-02 19:06 ` Jeff Garzik
2008-04-02 9:12 ` Nick Piggin
2008-04-02 15:54 ` Andrew Morton
2008-04-03 5:22 ` Nick Piggin
2008-04-03 5:32 ` Andrew Morton
2008-04-03 8:59 ` Evgeniy Polyakov
2008-06-26 21:06 ` Dave Jones
2008-06-26 22:26 ` Chris Snook
2008-06-27 10:01 ` Evgeniy Polyakov
2008-04-02 17:21 ` Jeff Garzik
2008-04-02 17:33 ` Andrew Morton
2008-04-02 18:18 ` Jeff Garzik
2008-04-02 18:37 ` Kok, Auke
2008-04-03 5:57 ` Nick Piggin
2008-04-03 18:20 ` Jeff Garzik
[not found] <adYyJ-20N-11@gated-at.bofh.it>
[not found] ` <aef6w-6rx-45@gated-at.bofh.it>
[not found] ` <aefJ9-7KR-15@gated-at.bofh.it>
[not found] ` <aeqF6-45P-29@gated-at.bofh.it>
2008-04-04 9:52 ` Bodo Eggert
2008-04-04 10:59 ` Nick Piggin
2008-04-04 11:35 ` Bodo Eggert
2008-04-05 1:06 ` Nick Piggin
2008-04-06 12:12 ` Bodo Eggert
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).