LKML Archive on lore.kernel.org
* [PATCH 0/3 v2] Fix kdump failures with crashkernel=high
@ 2015-01-06 14:51 Joerg Roedel
  2015-01-06 14:51 ` [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent Joerg Roedel
                   ` (5 more replies)
  0 siblings, 6 replies; 34+ messages in thread
From: Joerg Roedel @ 2015-01-06 14:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: x86, linux-kernel, Joerg Roedel, Joerg Roedel

v1->v2:

* Updated comments based on feedback from Konrad
* Added Acked-bys
* Rebased to v3.19-rc3

Hi,

here is a patch-set to fix failed kdump kernel boots when
the system was booted with crashkernel=X,high. On those
systems the kernel allocates only 72MiB of low-memory for
DMA buffers, which proved to be too little on some systems.

The problem is that 64MiB of the low-memory is allocated by
swiotlb, leaving 8MB for the page-allocator. But swiotlb
tries to allocate DMA memory from the page-allocator first,
which fails pretty fast in the boot sequence, causing
warnings. This patch-set removes these warnings.

But even the 64MiB for swiotlb are eaten up on some systems,
so the default amount of low-memory allocated for the
crash-kernel is increased from 72MiB to 256MiB (this only changes
the default, which can still be overridden by crashkernel=X,low).

This number comes from experiments on the affected systems:
128MiB of low-memory was still not enough there, thus I set the
value to 256MiB to fix the issues.

Any feedback appreciated.

Thanks,

	Joerg

Joerg Roedel (3):
  swiotlb: Warn on allocation failure in swiotlb_alloc_coherent
  x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  x86, crash: Allocate enough low-mem when crashkernel=high

 arch/x86/kernel/pci-swiotlb.c |  8 ++++++++
 arch/x86/kernel/setup.c       |  5 ++++-
 lib/swiotlb.c                 | 11 +++++++++--
 3 files changed, 21 insertions(+), 3 deletions(-)

-- 
1.9.1
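
The low-memory budget described in this cover letter can be written out as plain arithmetic. The following is only an illustrative userspace sketch, not kernel code: the helper names and the MiB() macro are hypothetical, and the 4KiB page size is an assumption (the usual x86 base page size).

```c
#include <stddef.h>

#define MiB(x) ((unsigned long)(x) << 20)

/*
 * Hypothetical helper: how much of the reserved low memory is left
 * for the page allocator once swiotlb has taken its share.
 */
static unsigned long page_allocator_share(unsigned long low_total,
					  unsigned long swiotlb_size)
{
	return low_total > swiotlb_size ? low_total - swiotlb_size : 0;
}

/* Number of 4KiB pages that fit into a byte count (assumed page size). */
static unsigned long pages_4k(unsigned long bytes)
{
	return bytes >> 12;
}
```

With the old default, page_allocator_share(MiB(72), MiB(64)) gives 8MiB, which is only 2048 pages of 4KiB. With a 256MiB reservation and an unchanged 64MiB swiotlb, the same helper gives 192MiB.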


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent
  2015-01-06 14:51 [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
@ 2015-01-06 14:51 ` Joerg Roedel
  2015-01-23 17:04   ` Borislav Petkov
  2015-01-06 14:51 ` [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN Joerg Roedel
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-01-06 14:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: x86, linux-kernel, Joerg Roedel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Print a warning when all allocation attempts have failed
and the function is about to return NULL. This prepares for
calling the function with __GFP_NOWARN to suppress
allocation failure warnings before all fall-backs have
failed.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 lib/swiotlb.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 4abda07..e0e9212 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -655,7 +655,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 		 */
 		phys_addr_t paddr = map_single(hwdev, 0, size, DMA_FROM_DEVICE);
 		if (paddr == SWIOTLB_MAP_ERROR)
-			return NULL;
+			goto err_warn;
 
 		ret = phys_to_virt(paddr);
 		dev_addr = phys_to_dma(hwdev, paddr);
@@ -669,7 +669,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 			/* DMA_TO_DEVICE to avoid memcpy in unmap_single */
 			swiotlb_tbl_unmap_single(hwdev, paddr,
 						 size, DMA_TO_DEVICE);
-			return NULL;
+			goto err_warn;
 		}
 	}
 
@@ -677,6 +677,13 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 	memset(ret, 0, size);
 
 	return ret;
+
+err_warn:
+	pr_warn("swiotlb: coherent allocation failed for device %s size=%zu\n",
+		dev_name(hwdev), size);
+	dump_stack();
+
+	return NULL;
 }
 EXPORT_SYMBOL(swiotlb_alloc_coherent);
 
-- 
1.9.1
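
The control flow this patch introduces is the common "single error exit" pattern: every failure path jumps to one label that prints the warning once, right before returning NULL. A minimal userspace sketch of that pattern follows. The function names, the fail parameter and the warnings_printed counter are made up for illustration, and malloc() stands in for the real allocation; this is not the swiotlb code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int warnings_printed;	/* illustration only: counts emitted warnings */

/* Hypothetical stand-in for a fall-back allocator that may fail. */
static void *fallback_alloc(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

static void *alloc_zeroed_or_warn(const char *dev, size_t size, int fail)
{
	void *ret = fallback_alloc(size, fail);

	if (!ret)
		goto err_warn;	/* all attempts failed, warn exactly once */

	memset(ret, 0, size);
	return ret;

err_warn:
	fprintf(stderr, "coherent allocation failed for device %s size=%zu\n",
		dev, size);
	warnings_printed++;
	return NULL;
}
```

Keeping the message at the very end means earlier attempts can fail silently without the final failure going unreported.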


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2015-01-06 14:51 [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
  2015-01-06 14:51 ` [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent Joerg Roedel
@ 2015-01-06 14:51 ` Joerg Roedel
  2015-01-23 17:03   ` Borislav Petkov
  2015-01-06 14:51 ` [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high Joerg Roedel
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-01-06 14:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: x86, linux-kernel, Joerg Roedel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

When we boot a kdump kernel in high memory, there is by
default only 72MB of low memory available. The swiotlb code
takes 64MB of it (by default) so that there are only 8MB
left to allocate from. On systems with many devices this
causes page allocator warnings from dma_generic_alloc_coherent():

systemd-udevd: page allocation failure: order:0, mode:0x280d4
CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G        W       3.12.28-4-default #1
Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
 ffff8800781335e0 ffffffff8150b1db 00000000000280d4 ffffffff8113af90
 0000000000000000 0000000000000000 ffff88007efdbb00 0000000100000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000001
Call Trace:
 [<ffffffff8100467d>] dump_trace+0x7d/0x2d0
 [<ffffffff81004964>] show_stack_log_lvl+0x94/0x170
 [<ffffffff81005d91>] show_stack+0x21/0x50
 [<ffffffff8150b1db>] dump_stack+0x41/0x51
 [<ffffffff8113af90>] warn_alloc_failed+0xf0/0x160
 [<ffffffff8150763a>] __alloc_pages_slowpath+0x72f/0x796
 [<ffffffff8113ee7a>] __alloc_pages_nodemask+0x1ea/0x210
 [<ffffffff81008256>] dma_generic_alloc_coherent+0x96/0x140
 [<ffffffff8103fccc>] x86_swiotlb_alloc_coherent+0x1c/0x50
 [<ffffffffa048ae5b>] ttm_dma_pool_alloc_new_pages+0xab/0x320 [ttm]
 [<ffffffffa048bc6e>] ttm_dma_populate+0x3ce/0x640 [ttm]
 [<ffffffffa0486>] ttm_tt_bind+0x36/0x60 [ttm]
 [<ffffffffa0484faf>] ttm_bo_handle_move_mem+0x55f/0x5c0 [ttm]
 [<ffffffffa0485be5>] ttm_bo_move_buffer+0x105/0x130 [ttm]
 [<ffffffffa0485cd1>] ttm_bo_validate+0xc1/0x130 [ttm]
 [<ffffffffa0485f8b>] ttm_bo_init+0x24b/0x400 [ttm]
 [<ffffffffa054f8bc>] radeon_bo_create+0x16c/0x200 [radeon]
 [<ffffffffa0563c8e>] radeon_ring_init+0x11e/0x2b0 [radeon]
 [<ffffffffa056c143>] r100_cp_init+0x123/0x5b0 [radeon]
 [<ffffffffa056e8e4>] r100_startup+0x194/0x230 [radeon]
 [<ffffffffa056ece3>] r100_init+0x223/0x410 [radeon]
 [<ffffffffa053495f>] radeon_device_init+0x6af/0x830 [radeon]
 [<ffffffffa0536979>] radeon_driver_load_kms+0x89/0x180 [radeon]
 [<ffffffffa04eeb31>] drm_get_pci_dev+0x121/0x2f0 [drm]
 [<ffffffff812d3ec9>] local_pci_probe+0x39/0x60
 [<ffffffff812d51e9>] pci_device_probe+0xa9/0x120
 [<ffffffff8139871d>] driver_probe_device+0x9d/0x3d0
 [<ffffffff81398b1b>] __driver_attach+0x8b/0x90
 [<ffffffff8139667b>] bus_for_each_dev+0x5b/0x90
 [<ffffffff81397cd8>] bus_add_driver+0x1f8/0x2c0
 [<ffffffff8139911b>] driver_register+0x5b/0xe0
 [<ffffffff810002c2>] do_one_initcall+0xf2/0x1a0
 [<ffffffff810c8837>] load_module+0x1207/0x1c70
 [<ffffffff810c93f5>] SYSC_finit_module+0x75/0xa0
 [<ffffffff81519329>] system_call_fastpath+0x16/0x1b
 [<00007fac533d2789>] 0x7fac533d2788

After these warnings the code enters a fall-back path and
allocates directly from the swiotlb aperture in the end.
So remove these warnings as this is not a fatal error.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/pci-swiotlb.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 77dd0ad..79b2291 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 {
 	void *vaddr;
 
+	/*
+	 * When booting a kdump kernel in high memory these allocations are very
+	 * likely to fail, as there are by default only 8MB of low memory to
+	 * allocate from. So disable the warnings from the allocator when this
+	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
+	 */
+	flags |= __GFP_NOWARN;
+
 	vaddr = dma_generic_alloc_coherent(hwdev, size, dma_handle, flags,
 					   attrs);
 	if (vaddr)
-- 
1.9.1
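
The idea of this patch, trying the first allocator quietly and leaving the noise to the last fall-back, can be modelled in a few lines of userspace C. Everything here is hypothetical: struct demo_pool, DEMO_NOWARN and two_stage_alloc() are illustration-only names, with DEMO_NOWARN merely playing the role that __GFP_NOWARN plays for the page allocator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_NOWARN 0x1u	/* hypothetical flag, models __GFP_NOWARN */

struct demo_pool {
	size_t free_bytes;	/* remaining capacity of this pool */
	unsigned int warnings;	/* failure warnings this pool has printed */
};

/* Take size bytes from the pool; warn on failure unless told not to. */
static bool pool_alloc(struct demo_pool *pool, size_t size, unsigned int flags)
{
	if (pool->free_bytes >= size) {
		pool->free_bytes -= size;
		return true;
	}
	if (!(flags & DEMO_NOWARN)) {
		pool->warnings++;
		fprintf(stderr, "allocation failure: size=%zu\n", size);
	}
	return false;
}

/* Returns 1 if the first pool served the request, 2 for the fall-back, 0 on failure. */
static int two_stage_alloc(struct demo_pool *pages, struct demo_pool *bounce,
			   size_t size)
{
	if (pool_alloc(pages, size, DEMO_NOWARN))	/* quiet first attempt */
		return 1;
	if (pool_alloc(bounce, size, 0))		/* last resort may warn */
		return 2;
	return 0;
}
```

In this model a request that misses the small first pool but fits the fall-back pool produces no warning at all; only a request that fails everywhere is reported.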


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-06 14:51 [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
  2015-01-06 14:51 ` [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent Joerg Roedel
  2015-01-06 14:51 ` [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN Joerg Roedel
@ 2015-01-06 14:51 ` Joerg Roedel
  2015-01-23  8:44   ` Baoquan He
  2015-01-23 17:02   ` Borislav Petkov
  2015-01-14 13:35 ` [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 34+ messages in thread
From: Joerg Roedel @ 2015-01-06 14:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: x86, linux-kernel, Joerg Roedel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

When the crashkernel is loaded above 4GiB in memory the
first kernel only allocates 72MiB of low-memory for the DMA
requirements of the second kernel. On systems with many
devices this is not enough and causes device driver
initialization errors and failed crash dumps. Set this
default value to 256MiB to make sure there is enough memory
available for DMA.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/setup.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ab4734e..d6e6a6d 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -536,8 +536,11 @@ static void __init reserve_crashkernel_low(void)
 		 *	swiotlb overflow buffer: now is hardcoded to 32k.
 		 *		We round it to 8M for other buffers that
 		 *		may need to stay low too.
+		 *		Also make sure we allocate enough extra
+		 *		low memory so that we don't run out of DMA
+		 *		buffers for 32bit devices.
 		 */
-		low_size = swiotlb_size_or_default() + (8UL<<20);
+		low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
 		auto_set = true;
 	} else {
 		/* passed with crashkernel=0,low ? */
-- 
1.9.1
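
The effect of the changed default can be checked with a small userspace model of the size selection. This is a sketch under assumptions: crash_low_size() and its parameters are hypothetical names, user_set stands for an explicit crashkernel=X,low on the command line, and only the max() expression is taken from the hunk above.

```c
#include <stdbool.h>

#define MiB(x) ((unsigned long)(x) << 20)

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical model of the low-memory size selection:
 * an explicit user value always wins, otherwise the default is
 * swiotlb size plus 8MiB, but never less than 256MiB.
 */
static unsigned long crash_low_size(bool user_set, unsigned long user_size,
				    unsigned long swiotlb_size)
{
	if (user_set)
		return user_size;
	return max_ul(swiotlb_size + MiB(8), MiB(256));
}
```

With the default 64MiB swiotlb this yields 256MiB instead of 72MiB, while a larger swiotlb (say 512MiB) still gets its 8MiB of headroom on top.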


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3 v2] Fix kdump failures with crashkernel=high
  2015-01-06 14:51 [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
                   ` (2 preceding siblings ...)
  2015-01-06 14:51 ` [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high Joerg Roedel
@ 2015-01-14 13:35 ` Joerg Roedel
  2015-01-19 19:26 ` Borislav Petkov
  2015-02-14 10:58 ` Baoquan He
  5 siblings, 0 replies; 34+ messages in thread
From: Joerg Roedel @ 2015-01-14 13:35 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: x86, linux-kernel, Joerg Roedel

Ping.

On Tue, Jan 06, 2015 at 03:51:11PM +0100, Joerg Roedel wrote:
> v1->v2:
> 
> * Updated comments based on feedback from Konrad
> * Added Acked-bys
> * Rebased to v3.19-rc3
> 
> Hi,
> 
> here is a patch-set to fix failed kdump kernel boots when
> the systems was booted with crashkernel=X,high. On those
> systems the kernel allocates only 72MiB of low-memory for
> DMA buffers, which showed to be too low on some systems.
> 
> The problem is that 64MiB of the low-memory is allocated by
> swiotlb, leaving 8MB for the page-allocator. But swiotlb
> tries to allocate DMA memory from the page-allocator first,
> which fails pretty fast in the boot sequence, causing
> warnings. This patch-set removes these warnings.
> 
> But even the 64MiB for swiotlb are eaten up on some systems,
> so that the default of low-memory allocated for the
> crash-kernel is increase from 72MB to 256MB (only changing
> the defaults, can still be overwritten by crashkernel=X,low).
> 
> This number comes from experiments on the affected systems,
> 128MiB low-memory was still not enough there, thus I set the
> value to 256MiB to fix the issues.
> 
> Any feedback appreciated.
> 
> Thanks,
> 
> 	Joerg
> 
> Joerg Roedel (3):
>   swiotlb: Warn on allocation failure in swiotlb_alloc_coherent
>   x86, swiotlb: Try coherent allocations with __GFP_NOWARN
>   x86, crash: Allocate enough low-mem when crashkernel=high
> 
>  arch/x86/kernel/pci-swiotlb.c |  8 ++++++++
>  arch/x86/kernel/setup.c       |  5 ++++-
>  lib/swiotlb.c                 | 11 +++++++++--
>  3 files changed, 21 insertions(+), 3 deletions(-)
> 
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3 v2] Fix kdump failures with crashkernel=high
  2015-01-06 14:51 [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
                   ` (3 preceding siblings ...)
  2015-01-14 13:35 ` [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
@ 2015-01-19 19:26 ` Borislav Petkov
  2015-02-14 10:58 ` Baoquan He
  5 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2015-01-19 19:26 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel,
	Vivek Goyal, Dave Young, kexec

Adding some more people to CC.

On Tue, Jan 06, 2015 at 03:51:11PM +0100, Joerg Roedel wrote:
> v1->v2:
> 
> * Updated comments based on feedback from Konrad
> * Added Acked-bys
> * Rebased to v3.19-rc3
> 
> Hi,
> 
> here is a patch-set to fix failed kdump kernel boots when
> the systems was booted with crashkernel=X,high. On those
> systems the kernel allocates only 72MiB of low-memory for
> DMA buffers, which showed to be too low on some systems.
> 
> The problem is that 64MiB of the low-memory is allocated by
> swiotlb, leaving 8MB for the page-allocator. But swiotlb
> tries to allocate DMA memory from the page-allocator first,
> which fails pretty fast in the boot sequence, causing
> warnings. This patch-set removes these warnings.
> 
> But even the 64MiB for swiotlb are eaten up on some systems,
> so that the default of low-memory allocated for the
> crash-kernel is increase from 72MB to 256MB (only changing
> the defaults, can still be overwritten by crashkernel=X,low).
> 
> This number comes from experiments on the affected systems,
> 128MiB low-memory was still not enough there, thus I set the
> value to 256MiB to fix the issues.
> 
> Any feedback appreciated.
> 
> Thanks,
> 
> 	Joerg
> 
> Joerg Roedel (3):
>   swiotlb: Warn on allocation failure in swiotlb_alloc_coherent
>   x86, swiotlb: Try coherent allocations with __GFP_NOWARN
>   x86, crash: Allocate enough low-mem when crashkernel=high
> 
>  arch/x86/kernel/pci-swiotlb.c |  8 ++++++++
>  arch/x86/kernel/setup.c       |  5 ++++-
>  lib/swiotlb.c                 | 11 +++++++++--
>  3 files changed, 21 insertions(+), 3 deletions(-)
> 
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-06 14:51 ` [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high Joerg Roedel
@ 2015-01-23  8:44   ` Baoquan He
  2015-01-26 12:07     ` Joerg Roedel
  2015-01-23 17:02   ` Borislav Petkov
  1 sibling, 1 reply; 34+ messages in thread
From: Baoquan He @ 2015-01-23  8:44 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

Hi Joerg,

Yeah, it does happen if there are too many devices. I guess the reason
no reports come to us on RHEL is that we always use an auto mechanism:
we always try to allocate from below 896M; if that fails, we try below
4G; if that fails too, we try above 4G.

This could be solved in 2 ways:

1) We could optimize the distro shell scripts which build the initramfs
for kdump, so that only those devices which are necessary in the kdump
kernel are included. E.g. for a net dump, only the NIC which connects to
the dump target will be brought up. This can decrease the DMA memory
requirement.

2) Increase low-mem when crashkernel=high. But we have to be careful
doing this. We implemented crashkernel=high not only because of the
unhappiness that the crashkernel reservation is limited to below 4G,
but also because dma/dma32 memory space is precious on some systems. If
crashkernel=high is set but too much low memory still has to be
reserved by default, that works against this goal, so it's important to
find the balance. So if we have to increase the default low-mem, how
much memory is enough, why 256M?  Why not 128M/192M/320M/384M?  And if
256M works on your system, what if another person says it doesn't work
because there are more devices on his system?

Anyway, I understand the requirement, but we need to find out how much
memory can satisfy most systems.


Thanks
Baoquan
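
The auto mechanism mentioned above (below 896M first, then below 4G, then above 4G) amounts to an ordered fall-back over three regions. A rough userspace sketch of that ordering follows; the enum, the pick_region() helper and the idea of passing free space per region as an array are all assumptions made for illustration, not the distro's actual implementation.

```c
#include <stddef.h>

#define MiB(x) ((unsigned long long)(x) << 20)
#define GiB(x) ((unsigned long long)(x) << 30)

enum region { BELOW_896M, BELOW_4G, ABOVE_4G, REGION_NONE };

/*
 * free_bytes[i] is the (hypothetical) free contiguous space in region i.
 * Regions are tried in order of preference; the first one that can hold
 * the request wins.
 */
static enum region pick_region(const unsigned long long free_bytes[3],
			       unsigned long long want)
{
	for (int i = BELOW_896M; i <= ABOVE_4G; i++)
		if (free_bytes[i] >= want)
			return (enum region)i;
	return REGION_NONE;
}
```

A reservation that fits below 896M never touches the higher regions, which may be why such setups see fewer low-memory problems in the kdump kernel.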

On 01/06/15 at 03:51pm, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> When the crashkernel is loaded above 4GiB in memory the
> first kernel only allocates 72MiB of low-memory for the DMA
> requirements of the second kernel. On systems with many
> devices this is not enough and causes device driver
> initialization errors and failed crash dumps. Set this
> default value to 256MiB to make sure there is enough memory
> available for DMA.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  arch/x86/kernel/setup.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index ab4734e..d6e6a6d 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -536,8 +536,11 @@ static void __init reserve_crashkernel_low(void)
>  		 *	swiotlb overflow buffer: now is hardcoded to 32k.
>  		 *		We round it to 8M for other buffers that
>  		 *		may need to stay low too.
> +		 *		Also make sure we allocate enough extra memory
> +		 *		low memory so that we don't run out of DMA
> +		 *		buffers for 32bit devices.
>  		 */
> -		low_size = swiotlb_size_or_default() + (8UL<<20);
> +		low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
>  		auto_set = true;
>  	} else {
>  		/* passed with crashkernel=0,low ? */
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-06 14:51 ` [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high Joerg Roedel
  2015-01-23  8:44   ` Baoquan He
@ 2015-01-23 17:02   ` Borislav Petkov
  2015-01-26 12:11     ` Joerg Roedel
  1 sibling, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2015-01-23 17:02 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Tue, Jan 06, 2015 at 03:51:14PM +0100, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> When the crashkernel is loaded above 4GiB in memory the
> first kernel only allocates 72MiB of low-memory for the DMA
> requirements of the second kernel. On systems with many
> devices this is not enough and causes device driver
> initialization errors and failed crash dumps. Set this
> default value to 256MiB to make sure there is enough memory

This upper limit of 256 looks arbitrary. Are we going to raise it a
couple of years from now if it becomes insufficient then?

It probably won't be easy but is there some more reliable way to
allocate enough memory for DMA on a say per-system basis or whatever...?
Probably not but let me ask it anyway.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2015-01-06 14:51 ` [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN Joerg Roedel
@ 2015-01-23 17:03   ` Borislav Petkov
  2015-01-26  3:22     ` Baoquan He
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2015-01-23 17:03 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Tue, Jan 06, 2015 at 03:51:13PM +0100, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> When we boot a kdump kernel in high memory, there is by
> default only 72MB of low memory available. The swiotlb code
> takes 64MB of it (by default) so that there are only 8MB
> left to allocate from. On systems with many devices this
> causes page allocator warnings from dma_generic_alloc_coherent():
> 
> systemd-udevd: page allocation failure: order:0, mode:0x280d4
> CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G        W       3.12.28-4-default #1
> Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
>  ffff8800781335e0 ffffffff8150b1db 00000000000280d4 ffffffff8113af90
>  0000000000000000 0000000000000000 ffff88007efdbb00 0000000100000000
>  0000000000000000 0000000000000000 0000000000000000 0000000000000001
> Call Trace:
>  [<ffffffff8100467d>] dump_trace+0x7d/0x2d0
>  [<ffffffff81004964>] show_stack_log_lvl+0x94/0x170
>  [<ffffffff81005d91>] show_stack+0x21/0x50
>  [<ffffffff8150b1db>] dump_stack+0x41/0x51
>  [<ffffffff8113af90>] warn_alloc_failed+0xf0/0x160
>  [<ffffffff8150763a>] __alloc_pages_slowpath+0x72f/0x796
>  [<ffffffff8113ee7a>] __alloc_pages_nodemask+0x1ea/0x210
>  [<ffffffff81008256>] dma_generic_alloc_coherent+0x96/0x140
>  [<ffffffff8103fccc>] x86_swiotlb_alloc_coherent+0x1c/0x50
>  [<ffffffffa048ae5b>] ttm_dma_pool_alloc_new_pages+0xab/0x320 [ttm]
>  [<ffffffffa048bc6e>] ttm_dma_populate+0x3ce/0x640 [ttm]
>  [<ffffffffa0486>] ttm_tt_bind+0x36/0x60 [ttm]
>  [<ffffffffa0484faf>] ttm_bo_handle_move_mem+0x55f/0x5c0 [ttm]
>  [<ffffffffa0485be5>] ttm_bo_move_buffer+0x105/0x130 [ttm]
>  [<ffffffffa0485cd1>] ttm_bo_validate+0xc1/0x130 [ttm]
>  [<ffffffffa0485f8b>] ttm_bo_init+0x24b/0x400 [ttm]
>  [<ffffffffa054f8bc>] radeon_bo_create+0x16c/0x200 [radeon]
>  [<ffffffffa0563c8e>] radeon_ring_init+0x11e/0x2b0 [radeon]
>  [<ffffffffa056c143>] r100_cp_init+0x123/0x5b0 [radeon]
>  [<ffffffffa056e8e4>] r100_startup+0x194/0x230 [radeon]
>  [<ffffffffa056ece3>] r100_init+0x223/0x410 [radeon]
>  [<ffffffffa053495f>] radeon_device_init+0x6af/0x830 [radeon]
>  [<ffffffffa0536979>] radeon_driver_load_kms+0x89/0x180 [radeon]
>  [<ffffffffa04eeb31>] drm_get_pci_dev+0x121/0x2f0 [drm]
>  [<ffffffff812d3ec9>] local_pci_probe+0x39/0x60
>  [<ffffffff812d51e9>] pci_device_probe+0xa9/0x120
>  [<ffffffff8139871d>] driver_probe_device+0x9d/0x3d0
>  [<ffffffff81398b1b>] __driver_attach+0x8b/0x90
>  [<ffffffff8139667b>] bus_for_each_dev+0x5b/0x90
>  [<ffffffff81397cd8>] bus_add_driver+0x1f8/0x2c0
>  [<ffffffff8139911b>] driver_register+0x5b/0xe0
>  [<ffffffff810002c2>] do_one_initcall+0xf2/0x1a0
>  [<ffffffff810c8837>] load_module+0x1207/0x1c70
>  [<ffffffff810c93f5>] SYSC_finit_module+0x75/0xa0
>  [<ffffffff81519329>] system_call_fastpath+0x16/0x1b
>  [<00007fac533d2789>] 0x7fac533d2788

Please remove those addresses and offsets and leave only the function
names in the call trace. The rest is just useless clutter, as those
addresses don't mean anything on anything else but this kernel.

> After these warnings the code enters a fall-back path and
> allocated directly from the swiotlb aperture in the end.
> So remove these warnings as this is not a fatal error.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  arch/x86/kernel/pci-swiotlb.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> index 77dd0ad..79b2291 100644
> --- a/arch/x86/kernel/pci-swiotlb.c
> +++ b/arch/x86/kernel/pci-swiotlb.c
> @@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  {
>  	void *vaddr;
>  
> +	/*
> +	 * When booting a kdump kernel in high memory these allocations are very
> +	 * likely to fail, as there are by default only 8MB of low memory to
> +	 * allocate from. So disable the warnings from the allocator when this
> +	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
> +	 */
> +	flags |= __GFP_NOWARN;

Ok, so this practically does all allocations __GFP_NOWARN now. Shouldn't
you be doing this before swiotlb_alloc_coherent() and not before
dma_generic_alloc_coherent()?

>  	vaddr = dma_generic_alloc_coherent(hwdev, size, dma_handle, flags,
>  					   attrs);
>  	if (vaddr)

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent
  2015-01-06 14:51 ` [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent Joerg Roedel
@ 2015-01-23 17:04   ` Borislav Petkov
  2015-01-26 11:49     ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2015-01-23 17:04 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Tue, Jan 06, 2015 at 03:51:12PM +0100, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> Print a warning when all allocation tries have been failed
> and the function is about to return NULL. This prepares for
> calling the function with __GFP_NOWARN to suppress
> allocation failure warnings before all fall-backs have
> failed.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---

...

> @@ -677,6 +677,13 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  	memset(ret, 0, size);
>  
>  	return ret;
> +
> +err_warn:
> +	pr_warn("swiotlb: coherent allocation failed for device %s size=%zu\n",
> +		dev_name(hwdev), size);
> +	dump_stack();

Are we really sure we want to be that noisy about it? What happens if
that fails, we can't do DMA anymore or should we free some precious DMA
memory, as a compromise?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2015-01-23 17:03   ` Borislav Petkov
@ 2015-01-26  3:22     ` Baoquan He
  2015-01-26 11:54       ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Baoquan He @ 2015-01-26  3:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On 01/23/15 at 06:03pm, Borislav Petkov wrote:
> On Tue, Jan 06, 2015 at 03:51:13PM +0100, Joerg Roedel wrote:
> > From: Joerg Roedel <jroedel@suse.de>

> > diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> > index 77dd0ad..79b2291 100644
> > --- a/arch/x86/kernel/pci-swiotlb.c
> > +++ b/arch/x86/kernel/pci-swiotlb.c
> > @@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
> >  {
> >  	void *vaddr;
> >  
> > +	/*
> > +	 * When booting a kdump kernel in high memory these allocations are very
> > +	 * likely to fail, as there are by default only 8MB of low memory to
> > +	 * allocate from. So disable the warnings from the allocator when this
> > +	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
> > +	 */
> > +	flags |= __GFP_NOWARN;
> 
> Ok, so this practically does all allocations __GFP_NOWARN now. Shouldn't
> you be doing this before swiotlb_alloc_coherent() and not before
> dma_generic_alloc_coherent()?

I think this patch mainly suppresses the warnings from buddy allocation
failures, because buddy allocation is tried several times before the
final attempt at bounce buffer allocation. A buddy allocation failure
will call dump_stack.


> 
> >  	vaddr = dma_generic_alloc_coherent(hwdev, size, dma_handle, flags,
> >  					   attrs);
> >  	if (vaddr)
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> ECO tip #101: Trim your mails when you reply.
> --

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent
  2015-01-23 17:04   ` Borislav Petkov
@ 2015-01-26 11:49     ` Joerg Roedel
  0 siblings, 0 replies; 34+ messages in thread
From: Joerg Roedel @ 2015-01-26 11:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Fri, Jan 23, 2015 at 06:04:25PM +0100, Borislav Petkov wrote:
> On Tue, Jan 06, 2015 at 03:51:12PM +0100, Joerg Roedel wrote:
> > +
> > +err_warn:
> > +	pr_warn("swiotlb: coherent allocation failed for device %s size=%zu\n",
> > +		dev_name(hwdev), size);
> > +	dump_stack();
> 
> Are we really sure we want to be that noisy about it? What happens if
> that fails, we can't do DMA anymore or should we free some precious DMA
> memory, as a compromise?

Hmm, I don't think there is a way to request drivers to free DMA memory.
What this patch does is to keep the same noisiness as before, just not
from the page allocator but from the very end of the DMA allocation
process, when everything has failed.
We can of course discuss whether this is too noisy and put some kind of
ratelimit around it.


	Joerg
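
For the ratelimit idea, the kernel already offers helpers such as pr_warn_ratelimited(). The underlying interval/burst logic can be sketched in userspace as below. This is only an illustration of the technique: struct demo_ratelimit and demo_ratelimit_ok() are hypothetical names, and time is passed in as a plain number instead of being read from a clock.

```c
#include <stdbool.h>

struct demo_ratelimit {
	long interval;	/* window length, in arbitrary time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages already emitted in this window */
};

/* Returns true if a message may be printed at time now. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rl, long now)
{
	if (now - rl->begin >= rl->interval) {
		/* window expired: start a new one */
		rl->begin = now;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}
```

A caller would wrap the warning and the stack dump in a check like this, so that a driver retrying failed allocations in a loop does not flood the log.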


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2015-01-26  3:22     ` Baoquan He
@ 2015-01-26 11:54       ` Joerg Roedel
  0 siblings, 0 replies; 34+ messages in thread
From: Joerg Roedel @ 2015-01-26 11:54 UTC (permalink / raw)
  To: Baoquan He
  Cc: Borislav Petkov, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Mon, Jan 26, 2015 at 11:22:36AM +0800, Baoquan He wrote:
> > Ok, so this practically does all allocations __GFP_NOWARN now. Shouldn't
> > you be doing this before swiotlb_alloc_coherent() and not before
> > dma_generic_alloc_coherent()?
> 
> I think this patch mainly suppresses the warning from buddy allocation
> failure, because buddy allocation is tried several times before the
> final try of bounce buffer allocation. A buddy allocation failure will
> call dump_stack.

Yes, exactly. The default low-memory available to the page-allocator
with crashkernel=high is 8MB. This is used up pretty fast and then we
start to get warnings, even when there is still memory left in the
swiotlb space. The __GFP_NOWARN is there to suppress the warnings from
the page-allocator, so it has to be set before it is called.
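
The ordering described here can be sketched with a toy userspace model.
This is not kernel code: struct low_mem and the three functions are
hypothetical stand-ins for the page allocator, the swiotlb aperture and
the coherent allocation path, and a counter stands in for the printed
warnings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: two pools with a fixed amount of free memory. */
struct low_mem {
	size_t page_alloc_free;	/* low memory left in the page allocator */
	size_t swiotlb_free;	/* memory left in the swiotlb aperture */
	unsigned int warnings;	/* how many warnings were printed */
};

/* First attempt: page allocator. Warns on failure unless nowarn is set. */
static bool page_alloc(struct low_mem *m, size_t size, bool nowarn)
{
	if (size <= m->page_alloc_free) {
		m->page_alloc_free -= size;
		return true;
	}
	if (!nowarn)
		m->warnings++;	/* stands in for the page allocation failure splat */
	return false;
}

/* Last resort: swiotlb. A failure here is final, so it always warns. */
static bool swiotlb_alloc(struct low_mem *m, size_t size)
{
	if (size <= m->swiotlb_free) {
		m->swiotlb_free -= size;
		return true;
	}
	m->warnings++;
	return false;
}

/* The nowarn flag is set before the first attempt, as discussed above. */
static bool coherent_alloc(struct low_mem *m, size_t size)
{
	assert(size > 0);
	if (page_alloc(m, size, true))
		return true;
	return swiotlb_alloc(m, size);
}
```

In this model a request that misses the small page-allocator pool but
fits into swiotlb produces no warning; only a request that fails in both
pools does.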


	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-23  8:44   ` Baoquan He
@ 2015-01-26 12:07     ` Joerg Roedel
  2015-02-01  8:41       ` Baoquan He
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-01-26 12:07 UTC (permalink / raw)
  To: Baoquan He
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

Hi Baoquan,

thanks for your reply.

On Fri, Jan 23, 2015 at 04:44:53PM +0800, Baoquan He wrote:
> 2) increase low-mem when crashkernel=high. But we have to be careful
> doing this. We implemented crashkernel=high not only because people
> were unhappy that the crashkernel reservation is limited to below 4G,
> but also because dma/dma32 memory space is precious on some systems.
> If too much low memory still has to be reserved by default with
> crashkernel=high, it's important to find the balance. So if we have
> to increase the default low-mem, how much memory is enough, why 256M?
> Why not 128M/192M/320M/384M? And if 256M works on your system, what
> if another person says it doesn't work because there are more devices
> on his system?
> 
> Anyway, I understand the requirement, but we need to find out how much
> memory can satisfy most systems.

Yes, I totally agree that it is tough to find a good default here. I
used 256MB because this is what was required on the system the failed
kdumps were reported on.

But probably we can agree that 72MB is not enough (given that 64MB are
taken away by swiotlb already), and increase it to a value we think by
now is sufficient for most systems.
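
The arithmetic behind that statement, as a small sketch. It assumes
swiotlb keeps the 64MB default cited in this thread; the macro and the
helper name are made up for illustration.

```c
#include <assert.h>

#define SWIOTLB_DEFAULT_MB 64u	/* default swiotlb size cited in the thread */

/* Low memory (in MB) left for the page allocator after swiotlb takes its share. */
static unsigned int page_alloc_low_mb(unsigned int reserved_low_mb)
{
	assert(reserved_low_mb >= SWIOTLB_DEFAULT_MB);
	return reserved_low_mb - SWIOTLB_DEFAULT_MB;
}
```

With the current 72MB default this leaves 8MB for the page allocator;
under the same assumption a 256MB reservation would leave 192MB.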

Btw, the issue was also reported on machines with only a few devices.
The reason there is that device drivers allocate more DMA memory by
default on initialization. Maybe we should handle that as a driver
regression in the future, forcing them to allocate more DMA memory
on demand and not on initialization.


	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-23 17:02   ` Borislav Petkov
@ 2015-01-26 12:11     ` Joerg Roedel
  2015-01-26 12:20       ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-01-26 12:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Fri, Jan 23, 2015 at 06:02:43PM +0100, Borislav Petkov wrote:
> On Tue, Jan 06, 2015 at 03:51:14PM +0100, Joerg Roedel wrote:
> > From: Joerg Roedel <jroedel@suse.de>
> > 
> > When the crashkernel is loaded above 4GiB in memory the
> > first kernel only allocates 72MiB of low-memory for the DMA
> > requirements of the second kernel. On systems with many
> > devices this is not enough and causes device driver
> > initialization errors and failed crash dumps. Set this
> > default value to 256MiB to make sure there is enough memory
> 
> This upper limit of 256 looks arbitrary. Are we going to raise it a
> couple of years from now if it becomes insufficient then?

Yes, it is arbitrary. I am open to suggestions on what might be a
proper value to satisfy most systems.

> It probably won't be easy but is there some more reliable way to
> allocate enough memory for DMA on a say per-system basis or whatever...?
> Probably not but let me ask it anyway.

Well, there is no easy way. But we could collect information from the
loaded drivers on boot about how much DMA memory they allocate and base
our allocation on that. Or we solve it in user-space by some more
cleverness in creating the kernel command-line for crashkernel=high.

But besides that, I think the first two patches of this set make sense
anyway. I understand that the third one is debatable.


	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-26 12:11     ` Joerg Roedel
@ 2015-01-26 12:20       ` Borislav Petkov
  2015-01-26 12:40         ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2015-01-26 12:20 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Mon, Jan 26, 2015 at 01:11:42PM +0100, Joerg Roedel wrote:
> Well, there is no easy way. But we could collect information from the
> loaded drivers on boot about how much DMA memory they allocate and base
> our allocation on that.

That sounds like a nifty idea to me.

> Or we solve it in user-space by some more cleverness in creating the
> kernel command-line for crashkernel=high.

I'd say, we should try to do as much as possible automatically.

> But besides that, I think the first two patches of this set make sense
> anyway. I understand that the third one is debatable.

Right, but since they fix a real problem, maybe we should take them now
until a better fix is done? Yes, no?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-26 12:20       ` Borislav Petkov
@ 2015-01-26 12:40         ` Joerg Roedel
  2015-01-26 12:45           ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-01-26 12:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Mon, Jan 26, 2015 at 01:20:17PM +0100, Borislav Petkov wrote:
> On Mon, Jan 26, 2015 at 01:11:42PM +0100, Joerg Roedel wrote:
> > Or we solve it in user-space by some more cleverness in creating the
> > kernel command-line for crashkernel=high.
> 
> I'd say, we should try to do as much as possible automatically.

That's hard to do. Without any information about the driver needs, we
only have the number of devices in the system to base any heuristic on.

> > But besides that, I think the first two patches of this set make sense
> > anyway. I understand that the third one is debatable.
> 
> Right, but since they fix a real problem, maybe we should take them now
> until a better fix is done? Yes, no?

Yes. Given that we have no data about what would work on most
systems, we can only change the value to a number that fixes known
problems and then act on possible regressions caused by the change (and
that change is pretty easy to revert if needed).


	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-26 12:40         ` Joerg Roedel
@ 2015-01-26 12:45           ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2015-01-26 12:45 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On Mon, Jan 26, 2015 at 01:40:06PM +0100, Joerg Roedel wrote:
> Yes. Given that we have no data about what would work on most
> systems, we can only change the value to a number that fixes known
> problems and then act on possible regressions caused by the change (and
> that change is pretty easy to revert if needed).

... and I think we can safely concentrate on kdump-specific usage anyway
and de-prioritize the others. Like not adding in the radeon module to
initramfs and such. I don't think we want to show a funky graphic while
dumping the first kernel.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-01-26 12:07     ` Joerg Roedel
@ 2015-02-01  8:41       ` Baoquan He
  2015-02-04 14:10         ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Baoquan He @ 2015-02-01  8:41 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On 01/26/15 at 01:07pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> thanks for your reply.
> 
> On Fri, Jan 23, 2015 at 04:44:53PM +0800, Baoquan He wrote:
> > 2) increase low-mem when crashkernel=high. But we have to be careful
> > doing this. We implemented crashkernel=high not only because people
> > were unhappy that the crashkernel reservation is limited to below 4G,
> > but also because dma/dma32 memory space is precious on some systems.
> > If too much low memory still has to be reserved by default with
> > crashkernel=high, it's important to find the balance. So if we have
> > to increase the default low-mem, how much memory is enough, why 256M?
> > Why not 128M/192M/320M/384M? And if 256M works on your system, what
> > if another person says it doesn't work because there are more devices
> > on his system?
> > 
> > Anyway, I understand the requirement, but we need to find out how much
> > memory can satisfy most systems.
> 
> Yes, I totally agree that it is tough to find a good default here. I
> used 256MB because this is what was required on the system the failed
> kdumps were reported on.
> 
> But probably we can agree that 72MB is not enough (given that 64MB are
> taken away by swiotlb already), and increase it to a value we think by
> now is sufficient for most systems.

Yeah, and I got a report from a user about this issue too. It should be
fixed. Like I said, the 1st suggestion would mainly affect the
initramfs-making tools, currently probably dracut, which is widely used.
This may require many changes. Hence increasing low mem is the better
idea.

When I said before that 256M may not be a good value, that was because
in your cover letter you said this number comes from experiments on the
affected systems: 128M was still not enough, so you set it to 256M. This
may be a little rushed. I think the step size for the increase should be
32M. After all, previously people only took 64M and 8M; enlarging it
once by a step of 128M can't be seen as patient and careful. If it
failed with 224M but succeeded with 256M, then 256M may not be enough
either. I would say 32M steps are better, then we can make a good
evaluation.

I will ask the user who reported this issue to help test and see what
value satisfies their system.

Anyway, I think this patch is helpful and necessary.

> 
> Btw, the issue was also reported on machines with only a few devices.
> The reason there is that device drivers allocate more DMA memory by
> default on initialization. Maybe we should handle that as a driver
> regression in the future, forcing them to allocate more DMA memory
> on demand and not on initialization.

Yeah, agreed. In that case it should be handled as a regression.


Thanks
Baoquan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-02-01  8:41       ` Baoquan He
@ 2015-02-04 14:10         ` Joerg Roedel
  2015-02-09 12:20           ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-02-04 14:10 UTC (permalink / raw)
  To: Baoquan He
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel

Hi Baoquan,

On Sun, Feb 01, 2015 at 04:41:03PM +0800, Baoquan He wrote:
> When I said before that 256M may not be a good value, that was because
> in your cover letter you said this number comes from experiments on the
> affected systems: 128M was still not enough, so you set it to 256M. This
> may be a little rushed. I think the step size for the increase should be
> 32M. After all, previously people only took 64M and 8M; enlarging it
> once by a step of 128M can't be seen as patient and careful. If it
> failed with 224M but succeeded with 256M, then 256M may not be enough
> either. I would say 32M steps are better, then we can make a good
> evaluation.

That makes sense. I also asked the customer to test intermediate values,
we already know that it works with 256MB but also that 128MB are not
enough. I will report back when I have the results of the intermediate
values in 32MB steps.

Thanks,

	Joerg
	

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-02-04 14:10         ` Joerg Roedel
@ 2015-02-09 12:20           ` Joerg Roedel
  2015-02-13 15:34             ` Baoquan He
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-02-09 12:20 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Baoquan He, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel

Hi Baoquan,

On Wed, Feb 04, 2015 at 03:10:20PM +0100, Joerg Roedel wrote:
> That makes sense. I also asked the customer to test intermediate values,
> we already know that it works with 256MB but also that 128MB are not
> enough. I will report back when I have the results of the intermediate
> values in 32MB steps.

I got the results from the customer, and it turns out that a value of
192MB is sufficient to make the kdump succeed. It fails with 128MB and
160MB. 

So I think we can settle on 192MB for now. What do you think?
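
The evaluation of the test results can be written down as a small
sketch. The table only contains the values reported in this thread
(128MB and 160MB fail, 192MB and 256MB work); the struct and function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct low_mem_result {
	unsigned int low_mb;	/* tested low-memory reservation in MB */
	bool kdump_ok;		/* did the kdump succeed with this value? */
};

/* Results as reported in the thread. */
static const struct low_mem_result reported[] = {
	{ 128, false },
	{ 160, false },
	{ 192, true  },
	{ 256, true  },
};

/* Return the smallest passing value in MB, or 0 if none passed. */
static unsigned int smallest_sufficient_mb(const struct low_mem_result *r,
					   size_t n)
{
	unsigned int best = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (r[i].kdump_ok && (best == 0 || r[i].low_mb < best))
			best = r[i].low_mb;
	}
	return best;
}
```

This only finds the lowest tested value that worked on one system; it
says nothing about how much headroom other machines would need.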

Regards,

	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-02-09 12:20           ` Joerg Roedel
@ 2015-02-13 15:34             ` Baoquan He
  2015-02-13 22:28               ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Baoquan He @ 2015-02-13 15:34 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel

On 02/09/15 at 01:20pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> On Wed, Feb 04, 2015 at 03:10:20PM +0100, Joerg Roedel wrote:
> > That makes sense. I also asked the customer to test intermediate values,
> > we already know that it works with 256MB but also that 128MB are not
> > enough. I will report back when I have the results of the intermediate
> > values in 32MB steps.
> 
> I got the results from the customer, and it turns out that a value of
> 192MB is sufficient to make the kdump succeed. It fails with 128MB and
> 160MB. 
> 
> So I think we can settle on 192MB for now. What do you think?

Hi Joerg,

Sorry for late reply.

So that machine needs somewhere between 160M and 192M. Then how about
setting it to 256M? 192M seems very close to the brink, and a larger
value leaves enough room for more DMA devices on larger machines;
otherwise this value might have to be increased again soon.

Wasting some low memory is better than a kdump kernel that can't boot
because of insufficient low memory. Anyone who begrudges the low memory
can specify crashkernel=size,low manually.
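
To make the crashkernel=size,low form concrete, here is a hypothetical
userspace sketch that parses a single such argument into bytes. It is
not the kernel's own parser; the function name is invented, and it only
assumes a decimal size with an optional K, M or G suffix followed by
",low".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser for one "crashkernel=<size>,low" argument.
 * Stores the size in bytes in *bytes and returns true on success.
 */
static bool parse_crashkernel_low(const char *arg, unsigned long long *bytes)
{
	static const char prefix[] = "crashkernel=";
	unsigned long long val;
	unsigned int shift = 0;
	char *end;

	assert(arg != NULL && bytes != NULL);

	if (strncmp(arg, prefix, strlen(prefix)) != 0)
		return false;
	arg += strlen(prefix);

	/* Require a leading digit so signs and whitespace are rejected. */
	if (*arg < '0' || *arg > '9')
		return false;
	val = strtoull(arg, &end, 10);

	/* Optional size suffix. */
	if (*end == 'K')
		shift = 10;
	else if (*end == 'M')
		shift = 20;
	else if (*end == 'G')
		shift = 30;
	if (shift) {
		val <<= shift;
		end++;
	}

	/* Only the ",low" variant is accepted by this sketch. */
	if (strcmp(end, ",low") != 0)
		return false;

	*bytes = val;
	return true;
}
```

For example, the string "crashkernel=192M,low" would yield 192 << 20
bytes, while "crashkernel=512M,high" is rejected by this sketch.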

In conclusion, I prefer 256M, since the testing data showed it's
sufficient now and it should be safe for a long time.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-02-13 15:34             ` Baoquan He
@ 2015-02-13 22:28               ` Joerg Roedel
  2015-02-14 11:44                 ` Baoquan He
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-02-13 22:28 UTC (permalink / raw)
  To: Baoquan He
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel

Hi Baoquan,

On Fri, Feb 13, 2015 at 11:34:38PM +0800, Baoquan He wrote:
> In conclusion, I prefer 256M, since the testing data showed it's
> sufficient now and it should be safe for a long time.

Thanks, I am fine with 256MB too, so can I have your Acked-by on this
series? I will rebase and resend it then after the merge window in the
hope it gets queued.


	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3 v2] Fix kdump failures with crashkernel=high
  2015-01-06 14:51 [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
                   ` (4 preceding siblings ...)
  2015-01-19 19:26 ` Borislav Petkov
@ 2015-02-14 10:58 ` Baoquan He
  2015-02-14 16:11   ` Joerg Roedel
  5 siblings, 1 reply; 34+ messages in thread
From: Baoquan He @ 2015-02-14 10:58 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

This patch is very helpful and necessary since several users complained
about the failure caused by not enough low mem. And the default value of
256M is suitable since the testing data showed it's sufficient
now and should be safe for a long time.

It also makes sense to suppress the warning from buddy allocation
failure, which calls dump_stack, in x86_swiotlb_alloc_coherent, since
buddy allocation is tried several times before the final try of bounce
buffer allocation.

So ack the whole patch set.

Acked-by: Baoquan He <bhe@redhat.com> 

Hi Joerg,

Thanks for your effort on this issue. 

Could you please also update the cover letter or patch log to explain
how 256M was arrived at, including the later test results? I think that
is convincing and helps understanding.

Thanks
Baoquan

On 01/06/15 at 03:51pm, Joerg Roedel wrote:
> v1->v2:
> 
> * Updated comments based on feedback from Konrad
> * Added Acked-bys
> * Rebased to v3.19-rc3
> 
> Hi,
> 
> here is a patch-set to fix failed kdump kernel boots when
> the system was booted with crashkernel=X,high. On those
> systems the kernel allocates only 72MiB of low-memory for
> DMA buffers, which turned out to be too low on some systems.
> 
> The problem is that 64MiB of the low-memory is allocated by
> swiotlb, leaving 8MB for the page-allocator. But swiotlb
> tries to allocate DMA memory from the page-allocator first,
> which fails pretty fast in the boot sequence, causing
> warnings. This patch-set removes these warnings.
> 
> But even the 64MiB for swiotlb are eaten up on some systems,
> so the default of low-memory allocated for the
> crash-kernel is increased from 72MB to 256MB (only changing
> the defaults, can still be overwritten by crashkernel=X,low).
> 
> This number comes from experiments on the affected systems,
> 128MiB low-memory was still not enough there, thus I set the
> value to 256MiB to fix the issues.
> 
> Any feedback appreciated.
> 
> Thanks,
> 
> 	Joerg
> 
> Joerg Roedel (3):
>   swiotlb: Warn on allocation failure in swiotlb_alloc_coherent
>   x86, swiotlb: Try coherent allocations with __GFP_NOWARN
>   x86, crash: Allocate enough low-mem when crashkernel=high
> 
>  arch/x86/kernel/pci-swiotlb.c |  8 ++++++++
>  arch/x86/kernel/setup.c       |  5 ++++-
>  lib/swiotlb.c                 | 11 +++++++++--
>  3 files changed, 21 insertions(+), 3 deletions(-)
> 
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high
  2015-02-13 22:28               ` Joerg Roedel
@ 2015-02-14 11:44                 ` Baoquan He
  0 siblings, 0 replies; 34+ messages in thread
From: Baoquan He @ 2015-02-14 11:44 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel

On 02/13/15 at 11:28pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> On Fri, Feb 13, 2015 at 11:34:38PM +0800, Baoquan He wrote:
> > In conclusion, I prefer 256M, since the testing data showed it's
> > sufficient now and it should be safe for a long time.
> 
> Thanks, I am fine with 256MB too, so can I have your Acked-by on this
> series? I will rebase and resend it then after the merge window in the
> hope it gets queued.

Sure. Have acked it in reply to cover letter, feel free to add my
Acked-by in your resend.

Thanks a lot!

Thanks
Baoquan

> 
> 
> 	Joerg
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3 v2] Fix kdump failures with crashkernel=high
  2015-02-14 10:58 ` Baoquan He
@ 2015-02-14 16:11   ` Joerg Roedel
  2015-06-02  8:54     ` Baoquan He
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2015-02-14 16:11 UTC (permalink / raw)
  To: Baoquan He
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

Hi Baoquan,

On Sat, Feb 14, 2015 at 06:58:34PM +0800, Baoquan He wrote:
> This patch is very helpful and necessary since several users complained
> about the failure caused by not enough low mem. And the default value of
> 256M is suitable since the testing data showed it's sufficient
> now and should be safe for a long time.
> 
> It also makes sense to suppress the warning from buddy allocation
> failure, which calls dump_stack, in x86_swiotlb_alloc_coherent, since
> buddy allocation is tried several times before the final try of bounce
> buffer allocation.
> 
> So ack the whole patch set.
> 
> Acked-by: Baoquan He <bhe@redhat.com> 

Thanks a lot!

> Hi Joerg,
> 
> Thanks for your effort on this issue. 
> 
> Could you please also update the cover letter or patch log to explain
> how 256M was arrived at, including the later test results? I think that
> is convincing and helps understanding.

Sure thing, will update the patch description before I resend the
series.


Regards,

	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3 v2] Fix kdump failures with crashkernel=high
  2015-02-14 16:11   ` Joerg Roedel
@ 2015-06-02  8:54     ` Baoquan He
  2015-06-02  9:08       ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Baoquan He @ 2015-06-02  8:54 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel, Joerg Roedel

On 02/14/15 at 05:11pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> On Sat, Feb 14, 2015 at 06:58:34PM +0800, Baoquan He wrote:
> > This patch is very helpful and necessary since several users complained
> > about the failure caused by not enough low mem. And the default value of
> > 256M is suitable since the testing data showed it's sufficient
> > now and should be safe for a long time.
> > 
> > It also makes sense to suppress the warning from buddy allocation
> > failure, which calls dump_stack, in x86_swiotlb_alloc_coherent, since
> > buddy allocation is tried several times before the final try of bounce
> > buffer allocation.
> > 
> > So ack the whole patch set.
> > 
> > Acked-by: Baoquan He <bhe@redhat.com> 
> 
> Thanks a lot!
> 
> > Hi Joerg,
> > 
> > Thanks for your effort on this issue. 
> > 
> > Could you please also update the cover letter or patch log to explain
> > how 256M was arrived at, including the later test results? I think that
> > is convincing and helps understanding.
> 
> Sure thing, will update the patch description before I resend the
> series.

Hi Joerg,

Ping!

About this patchset, what's your plan?


Thanks
Baoquan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3 v2] Fix kdump failures with crashkernel=high
  2015-06-02  8:54     ` Baoquan He
@ 2015-06-02  9:08       ` Joerg Roedel
  0 siblings, 0 replies; 34+ messages in thread
From: Joerg Roedel @ 2015-06-02  9:08 UTC (permalink / raw)
  To: Baoquan He
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Konrad Rzeszutek Wilk, x86, linux-kernel

Hi Baoquan

On Tue, Jun 02, 2015 at 04:54:01PM +0800, Baoquan He wrote:
> Ping!
> 
> About this patchset, what's your plan?

Sorry, this dropped off my radar. Thanks for reminding me, I'll send a
new version soon.


	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2015-06-05 10:29 [PATCH 0/3 v3] " Joerg Roedel
@ 2015-06-05 10:30 ` Joerg Roedel
  0 siblings, 0 replies; 34+ messages in thread
From: Joerg Roedel @ 2015-06-05 10:30 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Borislav Petkov
  Cc: Konrad Rzeszutek Wilk, Vivek Goyal, Dave Young, Baoquan He, x86,
	kexec, joro, jroedel, linux-kernel

From: Joerg Roedel <jroedel@suse.de>

When we boot a kdump kernel in high memory, there is by
default only 72MB of low memory available. The swiotlb code
takes 64MB of it (by default) so that there are only 8MB
left to allocate from. On systems with many devices this
causes page allocator warnings from dma_generic_alloc_coherent():

systemd-udevd: page allocation failure: order:0, mode:0x280d4
CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G        W       3.12.28-4-default #1
Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
 ffff8800781335e0 ffffffff8150b1db 00000000000280d4 ffffffff8113af90
 0000000000000000 0000000000000000 ffff88007efdbb00 0000000100000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000001
Call Trace:
  dump_trace+0x7d/0x2d0
  show_stack_log_lvl+0x94/0x170
  show_stack+0x21/0x50
  dump_stack+0x41/0x51
  warn_alloc_failed+0xf0/0x160
  __alloc_pages_slowpath+0x72f/0x796
  __alloc_pages_nodemask+0x1ea/0x210
  dma_generic_alloc_coherent+0x96/0x140
  x86_swiotlb_alloc_coherent+0x1c/0x50
  ttm_dma_pool_alloc_new_pages+0xab/0x320 [ttm]
  ttm_dma_populate+0x3ce/0x640 [ttm]
  ttm_tt_bind+0x36/0x60 [ttm]
  ttm_bo_handle_move_mem+0x55f/0x5c0 [ttm]
  ttm_bo_move_buffer+0x105/0x130 [ttm]
  ttm_bo_validate+0xc1/0x130 [ttm]
  ttm_bo_init+0x24b/0x400 [ttm]
  radeon_bo_create+0x16c/0x200 [radeon]
  radeon_ring_init+0x11e/0x2b0 [radeon]
  r100_cp_init+0x123/0x5b0 [radeon]
  r100_startup+0x194/0x230 [radeon]
  r100_init+0x223/0x410 [radeon]
  radeon_device_init+0x6af/0x830 [radeon]
  radeon_driver_load_kms+0x89/0x180 [radeon]
  drm_get_pci_dev+0x121/0x2f0 [drm]
  local_pci_probe+0x39/0x60
  pci_device_probe+0xa9/0x120
  driver_probe_device+0x9d/0x3d0
  __driver_attach+0x8b/0x90
  bus_for_each_dev+0x5b/0x90
  bus_add_driver+0x1f8/0x2c0
  driver_register+0x5b/0xe0
  do_one_initcall+0xf2/0x1a0
  load_module+0x1207/0x1c70
  SYSC_finit_module+0x75/0xa0
  system_call_fastpath+0x16/0x1b
  0x7fac533d2788

After these warnings the code enters a fall-back path and
allocates directly from the swiotlb aperture in the end.
So remove these warnings as this is not a fatal error.

Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/pci-swiotlb.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 77dd0ad..6d6894c 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -20,6 +20,13 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 {
 	void *vaddr;
 
+	/*
+	 * Don't print a warning when the first allocation attempt
+	 * fails. The swiotlb_alloc_coherent() function will print a
+	 * warning when the allocation of DMA memory ultimately failed.
+	 */
+	flags |= __GFP_NOWARN;
+
 	vaddr = dma_generic_alloc_coherent(hwdev, size, dma_handle, flags,
 					   attrs);
 	if (vaddr)
-- 
1.9.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2014-12-02 18:46       ` Konrad Rzeszutek Wilk
@ 2014-12-03 10:27         ` Joerg Roedel
  0 siblings, 0 replies; 34+ messages in thread
From: Joerg Roedel @ 2014-12-03 10:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86,
	linux-kernel

On Tue, Dec 02, 2014 at 01:46:48PM -0500, Konrad Rzeszutek Wilk wrote:
> On Tue, Dec 02, 2014 at 03:45:51PM +0100, Joerg Roedel wrote:
> > On Mon, Dec 01, 2014 at 03:28:54PM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Nov 28, 2014 at 12:29:08PM +0100, Joerg Roedel wrote:
> > > > diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> > > > index 77dd0ad..79b2291 100644
> > > > --- a/arch/x86/kernel/pci-swiotlb.c
> > > > +++ b/arch/x86/kernel/pci-swiotlb.c
> > > > @@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
> > > >  {
> > > >  	void *vaddr;
> > > >  
> > > > +	/*
> > > > +	 * When booting a kdump kernel in high memory these allocations are very
> > > > +	 * likely to fail, as there are by default only 8MB of low memory to
> > > > +	 * allocate from. So disable the warnings from the allocator when this
> > > > +	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
> > > > +	 */
> > > > +	flags |= __GFP_NOWARN;
> > > 
> > > Should this perhaps then have 'if (kdump_kernel)' around it since
> > > the use-case seems to be kdump related?
> > 
> > Hmm, I don't think this is entirely kdump specific. It can also be
> > triggered on a non-kdump kernel, it is just much more unlikely. But
> > maybe I should change the comment to something like:
> > 
> > 	/*
> > 	 * Don't print a warning when the first allocation attempt
> > 	 * fails. The swiotlb_alloc_coherent() function will print a
> > > 	 * warning when the allocation of DMA memory ultimately failed.
> > 	 */
> 
> Much better. Thank you.
> > 
> > This takes the kdump-specifics out of this change (in the end
> > kdump-kernel loaded high is just a case where this failure is much more
> > likely).
> 
> <nods>

Okay, thanks. I'll update the patch.


	Joerg


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2014-12-02 14:45     ` Joerg Roedel
@ 2014-12-02 18:46       ` Konrad Rzeszutek Wilk
  2014-12-03 10:27         ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-12-02 18:46 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86,
	linux-kernel

On Tue, Dec 02, 2014 at 03:45:51PM +0100, Joerg Roedel wrote:
> On Mon, Dec 01, 2014 at 03:28:54PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Fri, Nov 28, 2014 at 12:29:08PM +0100, Joerg Roedel wrote:
> > > diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> > > index 77dd0ad..79b2291 100644
> > > --- a/arch/x86/kernel/pci-swiotlb.c
> > > +++ b/arch/x86/kernel/pci-swiotlb.c
> > > @@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
> > >  {
> > >  	void *vaddr;
> > >  
> > > +	/*
> > > +	 * When booting a kdump kernel in high memory these allocations are very
> > > +	 * likely to fail, as there are by default only 8MB of low memory to
> > > +	 * allocate from. So disable the warnings from the allocator when this
> > > +	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
> > > +	 */
> > > +	flags |= __GFP_NOWARN;
> > 
> > Should this perhaps then have 'if (kdump_kernel)' around it since
> > the use-case seems to be kdump related?
> 
> Hmm, I don't think this is entirely kdump specific. It can also be
> triggered on a non-kdump kernel; it is just much less likely there. But
> maybe I should change the comment to something like:
> 
> 	/*
> 	 * Don't print a warning when the first allocation attempt
> 	 * fails. The swiotlb_alloc_coherent() function will print a
> 	 * warning when the allocation of DMA memory ultimately failed.
> 	 */

Much better. Thank you.
> 
> This takes the kdump-specifics out of this change (in the end
> kdump-kernel loaded high is just a case where this failure is much more
> likely).

<nods>
> 
> 
> 	Joerg
> 


* Re: [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2014-12-01 20:28   ` Konrad Rzeszutek Wilk
@ 2014-12-02 14:45     ` Joerg Roedel
  2014-12-02 18:46       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2014-12-02 14:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Joerg Roedel, Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86,
	linux-kernel

On Mon, Dec 01, 2014 at 03:28:54PM -0500, Konrad Rzeszutek Wilk wrote:
> On Fri, Nov 28, 2014 at 12:29:08PM +0100, Joerg Roedel wrote:
> > diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> > index 77dd0ad..79b2291 100644
> > --- a/arch/x86/kernel/pci-swiotlb.c
> > +++ b/arch/x86/kernel/pci-swiotlb.c
> > @@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
> >  {
> >  	void *vaddr;
> >  
> > +	/*
> > +	 * When booting a kdump kernel in high memory these allocations are very
> > +	 * likely to fail, as there are by default only 8MB of low memory to
> > +	 * allocate from. So disable the warnings from the allocator when this
> > +	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
> > +	 */
> > +	flags |= __GFP_NOWARN;
> 
> Should this perhaps then have 'if (kdump_kernel)' around it since
> the use-case seems to be kdump related?

Hmm, I don't think this is entirely kdump specific. It can also be
triggered on a non-kdump kernel; it is just much less likely there. But
maybe I should change the comment to something like:

	/*
	 * Don't print a warning when the first allocation attempt
	 * fails. The swiotlb_alloc_coherent() function will print a
	 * warning when the allocation of DMA memory ultimately failed.
	 */

This takes the kdump-specifics out of this change (in the end
kdump-kernel loaded high is just a case where this failure is much more
likely).
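
The pattern described above (silence the first attempt, warn only
when the fall-back also fails) can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the kernel
code: SKETCH_NOWARN, first_try_alloc(), fallback_alloc() and
alloc_coherent_sketch() are hypothetical names standing in for
__GFP_NOWARN, the page-allocator attempt, the swiotlb fall-back and
x86_swiotlb_alloc_coherent(), and the two bool parameters merely
simulate whether each pool still has memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's __GFP_NOWARN bit. */
#define SKETCH_NOWARN 0x1u

/* Counts every warning emitted, so callers can observe the behaviour. */
static int warnings_printed;

/* Simulated first attempt (page allocator). Warns on failure unless
 * the caller passed SKETCH_NOWARN. */
static void *first_try_alloc(size_t size, unsigned int flags, bool low_mem_left)
{
	if (low_mem_left)
		return malloc(size);
	if (!(flags & SKETCH_NOWARN)) {
		fprintf(stderr, "page allocation failure (size %zu)\n", size);
		warnings_printed++;
	}
	return NULL;
}

/* Simulated fall-back (bounce-buffer pool). This is the only place
 * that warns, and only when the allocation ultimately failed. */
static void *fallback_alloc(size_t size, bool pool_left)
{
	if (pool_left)
		return malloc(size);
	fprintf(stderr, "fall-back pool exhausted (size %zu)\n", size);
	warnings_printed++;
	return NULL;
}

/* Try the first allocator quietly, then fall back. */
static void *alloc_coherent_sketch(size_t size, unsigned int flags,
				   bool low_mem_left, bool pool_left)
{
	void *vaddr;

	/* Don't warn when the first attempt fails. */
	flags |= SKETCH_NOWARN;

	vaddr = first_try_alloc(size, flags, low_mem_left);
	if (vaddr)
		return vaddr;

	return fallback_alloc(size, pool_left);
}
```

In this sketch, calling alloc_coherent_sketch(4096, 0, false, true)
returns memory from the fall-back without printing anything, while
passing false for both pools returns NULL and prints exactly one
warning, from the fall-back path.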


	Joerg



* Re: [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2014-11-28 11:29 ` [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN Joerg Roedel
@ 2014-12-01 20:28   ` Konrad Rzeszutek Wilk
  2014-12-02 14:45     ` Joerg Roedel
  0 siblings, 1 reply; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-12-01 20:28 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86, linux-kernel,
	Joerg Roedel

On Fri, Nov 28, 2014 at 12:29:08PM +0100, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> When we boot a kdump kernel in high memory, there is by
> default only 72MB of low memory available. The swiotlb code
> takes 64MB of it (by default) so that there are only 8MB
> left to allocate from. On systems with many devices this
> causes page allocator warnings from dma_generic_alloc_coherent():
> 
> systemd-udevd: page allocation failure: order:0, mode:0x280d4
> CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G        W       3.12.28-4-default #1
> Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
>  ffff8800781335e0 ffffffff8150b1db 00000000000280d4 ffffffff8113af90
>  0000000000000000 0000000000000000 ffff88007efdbb00 0000000100000000
>  0000000000000000 0000000000000000 0000000000000000 0000000000000001
> Call Trace:
>  [<ffffffff8100467d>] dump_trace+0x7d/0x2d0
>  [<ffffffff81004964>] show_stack_log_lvl+0x94/0x170
>  [<ffffffff81005d91>] show_stack+0x21/0x50
>  [<ffffffff8150b1db>] dump_stack+0x41/0x51
>  [<ffffffff8113af90>] warn_alloc_failed+0xf0/0x160
>  [<ffffffff8150763a>] __alloc_pages_slowpath+0x72f/0x796
>  [<ffffffff8113ee7a>] __alloc_pages_nodemask+0x1ea/0x210
>  [<ffffffff81008256>] dma_generic_alloc_coherent+0x96/0x140
>  [<ffffffff8103fccc>] x86_swiotlb_alloc_coherent+0x1c/0x50
>  [<ffffffffa048ae5b>] ttm_dma_pool_alloc_new_pages+0xab/0x320 [ttm]
>  [<ffffffffa048bc6e>] ttm_dma_populate+0x3ce/0x640 [ttm]
>  [<ffffffffa0486>] ttm_tt_bind+0x36/0x60 [ttm]
>  [<ffffffffa0484faf>] ttm_bo_handle_move_mem+0x55f/0x5c0 [ttm]
>  [<ffffffffa0485be5>] ttm_bo_move_buffer+0x105/0x130 [ttm]
>  [<ffffffffa0485cd1>] ttm_bo_validate+0xc1/0x130 [ttm]
>  [<ffffffffa0485f8b>] ttm_bo_init+0x24b/0x400 [ttm]
>  [<ffffffffa054f8bc>] radeon_bo_create+0x16c/0x200 [radeon]
>  [<ffffffffa0563c8e>] radeon_ring_init+0x11e/0x2b0 [radeon]
>  [<ffffffffa056c143>] r100_cp_init+0x123/0x5b0 [radeon]
>  [<ffffffffa056e8e4>] r100_startup+0x194/0x230 [radeon]
>  [<ffffffffa056ece3>] r100_init+0x223/0x410 [radeon]
>  [<ffffffffa053495f>] radeon_device_init+0x6af/0x830 [radeon]
>  [<ffffffffa0536979>] radeon_driver_load_kms+0x89/0x180 [radeon]
>  [<ffffffffa04eeb31>] drm_get_pci_dev+0x121/0x2f0 [drm]
>  [<ffffffff812d3ec9>] local_pci_probe+0x39/0x60
>  [<ffffffff812d51e9>] pci_device_probe+0xa9/0x120
>  [<ffffffff8139871d>] driver_probe_device+0x9d/0x3d0
>  [<ffffffff81398b1b>] __driver_attach+0x8b/0x90
>  [<ffffffff8139667b>] bus_for_each_dev+0x5b/0x90
>  [<ffffffff81397cd8>] bus_add_driver+0x1f8/0x2c0
>  [<ffffffff8139911b>] driver_register+0x5b/0xe0
>  [<ffffffff810002c2>] do_one_initcall+0xf2/0x1a0
>  [<ffffffff810c8837>] load_module+0x1207/0x1c70
>  [<ffffffff810c93f5>] SYSC_finit_module+0x75/0xa0
>  [<ffffffff81519329>] system_call_fastpath+0x16/0x1b
>  [<00007fac533d2789>] 0x7fac533d2788
> 
> After these warnings the code enters a fall-back path and
> allocates directly from the swiotlb aperture in the end.
> So remove these warnings as this is not a fatal error.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  arch/x86/kernel/pci-swiotlb.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> index 77dd0ad..79b2291 100644
> --- a/arch/x86/kernel/pci-swiotlb.c
> +++ b/arch/x86/kernel/pci-swiotlb.c
> @@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  {
>  	void *vaddr;
>  
> +	/*
> +	 * When booting a kdump kernel in high memory these allocations are very
> +	 * likely to fail, as there are by default only 8MB of low memory to
> +	 * allocate from. So disable the warnings from the allocator when this
> +	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
> +	 */
> +	flags |= __GFP_NOWARN;

Should this perhaps then have 'if (kdump_kernel)' around it since
the use-case seems to be kdump related?

> +
>  	vaddr = dma_generic_alloc_coherent(hwdev, size, dma_handle, flags,
>  					   attrs);
>  	if (vaddr)
> -- 
> 1.9.1
> 


* [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN
  2014-11-28 11:29 [PATCH 0/3] Fix kdump failures with crashkernel=high Joerg Roedel
@ 2014-11-28 11:29 ` Joerg Roedel
  2014-12-01 20:28   ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 34+ messages in thread
From: Joerg Roedel @ 2014-11-28 11:29 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: x86, linux-kernel, Joerg Roedel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

When we boot a kdump kernel in high memory, there is by
default only 72MB of low memory available. The swiotlb code
takes 64MB of it (by default) so that there are only 8MB
left to allocate from. On systems with many devices this
causes page allocator warnings from dma_generic_alloc_coherent():

systemd-udevd: page allocation failure: order:0, mode:0x280d4
CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G        W       3.12.28-4-default #1
Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
 ffff8800781335e0 ffffffff8150b1db 00000000000280d4 ffffffff8113af90
 0000000000000000 0000000000000000 ffff88007efdbb00 0000000100000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000001
Call Trace:
 [<ffffffff8100467d>] dump_trace+0x7d/0x2d0
 [<ffffffff81004964>] show_stack_log_lvl+0x94/0x170
 [<ffffffff81005d91>] show_stack+0x21/0x50
 [<ffffffff8150b1db>] dump_stack+0x41/0x51
 [<ffffffff8113af90>] warn_alloc_failed+0xf0/0x160
 [<ffffffff8150763a>] __alloc_pages_slowpath+0x72f/0x796
 [<ffffffff8113ee7a>] __alloc_pages_nodemask+0x1ea/0x210
 [<ffffffff81008256>] dma_generic_alloc_coherent+0x96/0x140
 [<ffffffff8103fccc>] x86_swiotlb_alloc_coherent+0x1c/0x50
 [<ffffffffa048ae5b>] ttm_dma_pool_alloc_new_pages+0xab/0x320 [ttm]
 [<ffffffffa048bc6e>] ttm_dma_populate+0x3ce/0x640 [ttm]
 [<ffffffffa0486>] ttm_tt_bind+0x36/0x60 [ttm]
 [<ffffffffa0484faf>] ttm_bo_handle_move_mem+0x55f/0x5c0 [ttm]
 [<ffffffffa0485be5>] ttm_bo_move_buffer+0x105/0x130 [ttm]
 [<ffffffffa0485cd1>] ttm_bo_validate+0xc1/0x130 [ttm]
 [<ffffffffa0485f8b>] ttm_bo_init+0x24b/0x400 [ttm]
 [<ffffffffa054f8bc>] radeon_bo_create+0x16c/0x200 [radeon]
 [<ffffffffa0563c8e>] radeon_ring_init+0x11e/0x2b0 [radeon]
 [<ffffffffa056c143>] r100_cp_init+0x123/0x5b0 [radeon]
 [<ffffffffa056e8e4>] r100_startup+0x194/0x230 [radeon]
 [<ffffffffa056ece3>] r100_init+0x223/0x410 [radeon]
 [<ffffffffa053495f>] radeon_device_init+0x6af/0x830 [radeon]
 [<ffffffffa0536979>] radeon_driver_load_kms+0x89/0x180 [radeon]
 [<ffffffffa04eeb31>] drm_get_pci_dev+0x121/0x2f0 [drm]
 [<ffffffff812d3ec9>] local_pci_probe+0x39/0x60
 [<ffffffff812d51e9>] pci_device_probe+0xa9/0x120
 [<ffffffff8139871d>] driver_probe_device+0x9d/0x3d0
 [<ffffffff81398b1b>] __driver_attach+0x8b/0x90
 [<ffffffff8139667b>] bus_for_each_dev+0x5b/0x90
 [<ffffffff81397cd8>] bus_add_driver+0x1f8/0x2c0
 [<ffffffff8139911b>] driver_register+0x5b/0xe0
 [<ffffffff810002c2>] do_one_initcall+0xf2/0x1a0
 [<ffffffff810c8837>] load_module+0x1207/0x1c70
 [<ffffffff810c93f5>] SYSC_finit_module+0x75/0xa0
 [<ffffffff81519329>] system_call_fastpath+0x16/0x1b
 [<00007fac533d2789>] 0x7fac533d2788

After these warnings the code enters a fall-back path and
allocates directly from the swiotlb aperture in the end.
So remove these warnings as this is not a fatal error.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/pci-swiotlb.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 77dd0ad..79b2291 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -20,6 +20,14 @@ void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 {
 	void *vaddr;
 
+	/*
+	 * When booting a kdump kernel in high memory these allocations are very
+	 * likely to fail, as there are by default only 8MB of low memory to
+	 * allocate from. So disable the warnings from the allocator when this
+	 * happens.  SWIOTLB also implements fall-backs for failed allocations.
+	 */
+	flags |= __GFP_NOWARN;
+
 	vaddr = dma_generic_alloc_coherent(hwdev, size, dma_handle, flags,
 					   attrs);
 	if (vaddr)
-- 
1.9.1



Thread overview: 34+ messages
2015-01-06 14:51 [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
2015-01-06 14:51 ` [PATCH 1/3] swiotlb: Warn on allocation failure in swiotlb_alloc_coherent Joerg Roedel
2015-01-23 17:04   ` Borislav Petkov
2015-01-26 11:49     ` Joerg Roedel
2015-01-06 14:51 ` [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN Joerg Roedel
2015-01-23 17:03   ` Borislav Petkov
2015-01-26  3:22     ` Baoquan He
2015-01-26 11:54       ` Joerg Roedel
2015-01-06 14:51 ` [PATCH 3/3] x86, crash: Allocate enough low-mem when crashkernel=high Joerg Roedel
2015-01-23  8:44   ` Baoquan He
2015-01-26 12:07     ` Joerg Roedel
2015-02-01  8:41       ` Baoquan He
2015-02-04 14:10         ` Joerg Roedel
2015-02-09 12:20           ` Joerg Roedel
2015-02-13 15:34             ` Baoquan He
2015-02-13 22:28               ` Joerg Roedel
2015-02-14 11:44                 ` Baoquan He
2015-01-23 17:02   ` Borislav Petkov
2015-01-26 12:11     ` Joerg Roedel
2015-01-26 12:20       ` Borislav Petkov
2015-01-26 12:40         ` Joerg Roedel
2015-01-26 12:45           ` Borislav Petkov
2015-01-14 13:35 ` [PATCH 0/3 v2] Fix kdump failures with crashkernel=high Joerg Roedel
2015-01-19 19:26 ` Borislav Petkov
2015-02-14 10:58 ` Baoquan He
2015-02-14 16:11   ` Joerg Roedel
2015-06-02  8:54     ` Baoquan He
2015-06-02  9:08       ` Joerg Roedel
  -- strict thread matches above, loose matches on Subject: below --
2015-06-05 10:29 [PATCH 0/3 v3] " Joerg Roedel
2015-06-05 10:30 ` [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN Joerg Roedel
2014-11-28 11:29 [PATCH 0/3] Fix kdump failures with crashkernel=high Joerg Roedel
2014-11-28 11:29 ` [PATCH 2/3] x86, swiotlb: Try coherent allocations with __GFP_NOWARN Joerg Roedel
2014-12-01 20:28   ` Konrad Rzeszutek Wilk
2014-12-02 14:45     ` Joerg Roedel
2014-12-02 18:46       ` Konrad Rzeszutek Wilk
2014-12-03 10:27         ` Joerg Roedel
