* [PATCH] x86_64: fix overlap between pagetable with bss section
@ 2008-01-29  2:44 Yinghai Lu
  2008-01-29  4:05 ` [PATCH] x86_64: fix overlap between pagetable with bss section v2 Yinghai Lu
  2008-01-29 17:42 ` [PATCH] x86_64: fix overlap between pagetable with bss section Ingo Molnar
  0 siblings, 2 replies; 5+ messages in thread
From: Yinghai Lu @ 2008-01-29  2:44 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, Linux Kernel Mailing List

[PATCH] x86_64: fix overlap between pagetable with bss section

An early crash on an 8-node, 256 GB machine:

Command line: console=uart8250,io,0x3f8,115200n8 initrd=kernel.org/mydisk11_x86_64.gz rw root=/dev/ram0 debug initcall_debug apic=debug acpi.debug_level=0x0000000f pci=routeirq ip=dhcp load_ramdisk=1 ramdisk_size=131072 BOOT_IMAGE=kernel.org/bzImage_2.6.25_k8.1
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
 BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable)
 BIOS-e820: 00000000dffe0000 - 00000000dffee000 (ACPI data)
 BIOS-e820: 00000000dffee000 - 00000000dffff050 (ACPI NVS)
 BIOS-e820: 00000000dffff050 - 00000000e0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000004020000000 (usable)
Early serial console at I/O port 0x3f8 (options '115200n8')
console [uart0] enabled
end_pfn_map = 67239936
Kernel panic - not syncing: Duplicated early reservation d40000-e42000

Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #3

Call Trace:
 [<ffffffff80221545>] lapic_get_maxlvt+0x0/0x10
 [<ffffffff80221657>] clear_local_APIC+0x5/0xcf
 [<ffffffff80221726>] disable_local_APIC+0x5/0x17
 [<ffffffff8021fe16>] smp_send_stop+0x46/0x4c
 [<ffffffff80235293>] panic+0x94/0x13e
 [<ffffffff80bc3b03>] sctp_eps_proc_init+0x12/0x34
 [<ffffffff80b9f1c5>] reserve_early+0x30/0x6c
 [<ffffffff80803925>] init_memory_mapping+0x2cd/0x2dc
 [<ffffffff80b9dc01>] setup_arch+0x21f/0x44e
 [<ffffffff80b978be>] start_kernel+0x6f/0x2c7
 [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3

A later oops on another machine:

Calling initcall 0xffffffff80bc33ac: sctp_init+0x0/0x711()
BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
IP: [<ffffffff802bfe55>] proc_register+0xe7/0x10f
PGD 0
Oops: 0000 [1] SMP
CPU 7
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #1
RIP: 0010:[<ffffffff802bfe55>]  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
RSP: 0000:ffff811074c55e60  EFLAGS: 00010246
RAX: 0000000000008d8d RBX: ffff811074d78d80 RCX: ffff811074c55e08
RDX: 0000000000000000 RSI: 0000000000000141 RDI: ffffffff80cc2460
RBP: ffffffffffffffff R08: 0000000000000000 R09: ffff811074d78d80
R10: 0000000000000000 R11: ffffffff80b78750 R12: ffff811074c55e6c
R13: 0000000000000000 R14: ffff811074c55ee0 R15: 00000006eb27426e
FS:  0000000000000000(0000) GS:ffff811074cc7f00(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000000005f CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff811074c54000, task ffff810874c54000)
Stack:  ffffffff80a57340 0000014100000000 ffff811074d78d80 0000000000000000
 00000000ffffff97 ffffffff802bfef0 0000000000000000 ffffffffffffffff
 0000000000000000 ffffffff80bc3b41 ffff811074c55ee0 ffffffff80bc349b
Call Trace:
 [<ffffffff802bfef0>] ? create_proc_entry+0x73/0x8a
 [<ffffffff80bc3b41>] ? sctp_snmp_proc_init+0x1c/0x34
 [<ffffffff80bc349b>] ? sctp_init+0xef/0x711
 [<ffffffff80b976e3>] ? kernel_init+0x175/0x2e1
 [<ffffffff8020ccf8>] ? child_rip+0xa/0x12
 [<ffffffff80b9756e>] ? kernel_init+0x0/0x2e1
 [<ffffffff8020ccee>] ? child_rip+0x0/0x12


Code: 1e 48 83 7b 38 00 75 08 48 c7 43 38 f0 e8 82 80 48 83 7b 30 00 75 08 48 c7 43 30 d0 e9 82 80 48 c7 c7 60 24 cc 80 e8 bd 5a 54 00 <48> 8b 45 60 48 89 6b 58 48 89 5d 60 48 89 43 50 fe 05 f5 25 a0
RIP  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
 RSP <ffff811074c55e60>
CR2: 000000000000005f
---[ end trace c97bfb5810c69e0c ]---
Kernel panic - not syncing: Attempted to kill init!

It turns out the early page tables overlap the kernel bss section.

table_start needs to be rounded up to PAGE_SIZE.

Also make the panic message more informative.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>

diff --git a/arch/x86/kernel/e820_64.c b/arch/x86/kernel/e820_64.c
index f8b7beb..6f07bab 100644
--- a/arch/x86/kernel/e820_64.c
+++ b/arch/x86/kernel/e820_64.c
@@ -70,8 +70,8 @@ void __init reserve_early(unsigned long start, unsigned long end)
 	for (i = 0; i < MAX_EARLY_RES && early_res[i].end; i++) {
 		r = &early_res[i];
 		if (end > r->start && start < r->end)
-			panic("Duplicated early reservation %lx-%lx\n",
-			      start, end);
+			panic("Overlap early reservation %lx-%lx to %lx-%lx\n",
+			      start, end, r->start, r->end);
 	}
 	if (i >= MAX_EARLY_RES)
 		panic("Too many early reservations");
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b09faf2..bf02f7e 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -358,6 +358,8 @@ static void __init find_early_table_space(unsigned long end)
 	if (table_start == -1UL)
 		panic("Cannot find space for the kernel page tables");
 
+	/* need to round it up to avoid overlap less one page */
+	table_start = round_up(table_start, PAGE_SIZE);
 	table_start >>= PAGE_SHIFT;
 	table_end = table_start;
 


* [PATCH] x86_64: fix overlap between pagetable with bss section v2
  2008-01-29  2:44 [PATCH] x86_64: fix overlap between pagetable with bss section Yinghai Lu
@ 2008-01-29  4:05 ` Yinghai Lu
  2008-01-29 17:42 ` [PATCH] x86_64: fix overlap between pagetable with bss section Ingo Molnar
  1 sibling, 0 replies; 5+ messages in thread
From: Yinghai Lu @ 2008-01-29  4:05 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, Linux Kernel Mailing List

[PATCH] x86_64: fix overlap between pagetable with bss section v2

An early crash on an 8-node, 256 GB machine:

Command line: console=uart8250,io,0x3f8,115200n8 initrd=kernel.org/mydisk11_x86_64.gz rw root=/dev/ram0 debug initcall_debug apic=debug acpi.debug_level=0x0000000f pci=routeirq ip=dhcp load_ramdisk=1 ramdisk_size=131072 BOOT_IMAGE=kernel.org/bzImage_2.6.25_k8.1
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
 BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable)
 BIOS-e820: 00000000dffe0000 - 00000000dffee000 (ACPI data)
 BIOS-e820: 00000000dffee000 - 00000000dffff050 (ACPI NVS)
 BIOS-e820: 00000000dffff050 - 00000000e0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000004020000000 (usable)
Early serial console at I/O port 0x3f8 (options '115200n8')
console [uart0] enabled
end_pfn_map = 67239936
Kernel panic - not syncing: Duplicated early reservation d40000-e42000

Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #3

Call Trace:
 [<ffffffff80221545>] lapic_get_maxlvt+0x0/0x10
 [<ffffffff80221657>] clear_local_APIC+0x5/0xcf
 [<ffffffff80221726>] disable_local_APIC+0x5/0x17
 [<ffffffff8021fe16>] smp_send_stop+0x46/0x4c
 [<ffffffff80235293>] panic+0x94/0x13e
 [<ffffffff80bc3b03>] sctp_eps_proc_init+0x12/0x34
 [<ffffffff80b9f1c5>] reserve_early+0x30/0x6c
 [<ffffffff80803925>] init_memory_mapping+0x2cd/0x2dc
 [<ffffffff80b9dc01>] setup_arch+0x21f/0x44e
 [<ffffffff80b978be>] start_kernel+0x6f/0x2c7
 [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3

It turns out the early page tables overlap the kernel bss section.

System.map shows that bss ends at an address that is not page aligned:
ffffffff80d40420 b rsi_table
ffffffff80d40620 B krb5_seq_lock
ffffffff80d40628 b i.20437
ffffffff80d40630 b xprt_rdma_inline_write_padding
ffffffff80d40638 b sunrpc_table_header
ffffffff80d40640 b zero
ffffffff80d40644 b min_memreg
ffffffff80d40648 b rpcrdma_tk_lock_g
ffffffff80d40650 B sctp_assocs_id_lock
ffffffff80d40658 B proc_net_sctp
ffffffff80d40660 B sctp_assocs_id
ffffffff80d40680 B sysctl_sctp_mem
ffffffff80d40690 B sysctl_sctp_rmem
ffffffff80d406a0 B sysctl_sctp_wmem
ffffffff80d406b0 b sctp_ctl_socket
ffffffff80d406b8 b sctp_pf_inet6_specific
ffffffff80d406c0 b sctp_pf_inet_specific
ffffffff80d406c8 b sctp_af_v4_specific
ffffffff80d406d0 b sctp_af_v6_specific
ffffffff80d406d8 b sctp_rand.33270
ffffffff80d406dc b sctp_memory_pressure
ffffffff80d406e0 b sctp_sockets_allocated
ffffffff80d406e4 b sctp_memory_allocated
ffffffff80d406e8 b sctp_sysctl_header
ffffffff80d406f0 b zero
ffffffff80d406f4 A __bss_stop
ffffffff80d406f4 A _end

So table_start needs to be rounded up to PAGE_SIZE.
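
To illustrate the arithmetic, here is a standalone userspace sketch (not
kernel code). It assumes find_e820_area() returned 0xd406f4, the physical
address matching __bss_stop above; that value is inferred from System.map
and from the d40000 in the panic message, not taken from a log. ROUND_UP
is a local stand-in for the kernel's round_up(), and the helper names are
made up for the example.

```c
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Local stand-in for the kernel's round_up(); y must be a power of two. */
#define ROUND_UP(x, y) (((x) + (y) - 1) & ~((y) - 1))

/* Old behaviour: the shift alone silently rounds the address down. */
unsigned long pfn_truncated(unsigned long addr)
{
	return addr >> PAGE_SHIFT;
}

/* Patched behaviour: round up to a page boundary, then shift. */
unsigned long pfn_rounded_up(unsigned long addr)
{
	return ROUND_UP(addr, PAGE_SIZE) >> PAGE_SHIFT;
}

/* Print both results for one address (hypothetical helper). */
void show_table_start(unsigned long addr)
{
	printf("addr %#lx: old start %#lx, new start %#lx\n", addr,
	       pfn_truncated(addr) << PAGE_SHIFT,
	       pfn_rounded_up(addr) << PAGE_SHIFT);
}
```

For 0xd406f4 the shift alone gives pfn 0xd40, so the table would start at
0xd40000 and cover the last 0x6f4 bytes of bss. With the round-up the pfn
is 0xd41 and the table starts on the first page after bss.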

Also make the panic message more informative.
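
The check in reserve_early() treats each reservation as a half-open range
[start, end). Below is a userspace sketch of the same test, roughly as it
would apply to the page table range from the panic above. The kernel image
range 0x200000-0xd406f4 used in the note is an assumption (only its end
comes from System.map), and struct res, ranges_overlap() and
report_overlap() are names made up for this example.

```c
#include <stdio.h>

/* Half-open physical range [start, end); hypothetical stand-in type. */
struct res {
	unsigned long start;
	unsigned long end;
};

/* Same condition as in the patch: end > r->start && start < r->end. */
int ranges_overlap(unsigned long start, unsigned long end,
		   const struct res *r)
{
	return end > r->start && start < r->end;
}

/* Report a conflict in the style of the new panic message. */
int report_overlap(unsigned long start, unsigned long end,
		   const struct res *r)
{
	if (!ranges_overlap(start, end, r))
		return 0;
	printf("Overlap early reservation %lx-%lx to %lx-%lx\n",
	       start, end, r->start, r->end);
	return 1;
}
```

With an existing reservation of 0x200000-0xd406f4, a request for
d40000-e42000 is reported as overlapping, while d41000-e43000 is not.
Printing both ranges shows which earlier reservation was hit, which the
old message did not.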

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>

Index: linux-2.6/arch/x86/kernel/e820_64.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/e820_64.c
+++ linux-2.6/arch/x86/kernel/e820_64.c
@@ -70,8 +70,8 @@ void __init reserve_early(unsigned long 
 	for (i = 0; i < MAX_EARLY_RES && early_res[i].end; i++) {
 		r = &early_res[i];
 		if (end > r->start && start < r->end)
-			panic("Duplicated early reservation %lx-%lx\n",
-			      start, end);
+			panic("Overlap early reservation %lx-%lx to %lx-%lx\n",
+			      start, end, r->start, r->end);
 	}
 	if (i >= MAX_EARLY_RES)
 		panic("Too many early reservations");
Index: linux-2.6/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_64.c
+++ linux-2.6/arch/x86/mm/init_64.c
@@ -358,6 +358,13 @@ static void __init find_early_table_spac
 	if (table_start == -1UL)
 		panic("Cannot find space for the kernel page tables");
 
+	/*
+	 * With a lot of RAM (e.g. 256g), the early page tables do not fit
+	 * into the 0x8000 range, so find_e820_area() finds an area after
+	 * the kernel bss. That table_start is not page aligned, so round
+	 * it up to avoid overlapping the bss.
+	 */
+	table_start = round_up(table_start, PAGE_SIZE);
 	table_start >>= PAGE_SHIFT;
 	table_end = table_start;
 


* Re: [PATCH] x86_64: fix overlap between pagetable with bss section
  2008-01-29  2:44 [PATCH] x86_64: fix overlap between pagetable with bss section Yinghai Lu
  2008-01-29  4:05 ` [PATCH] x86_64: fix overlap between pagetable with bss section v2 Yinghai Lu
@ 2008-01-29 17:42 ` Ingo Molnar
  2008-01-29 18:03   ` Yinghai Lu
  1 sibling, 1 reply; 5+ messages in thread
From: Ingo Molnar @ 2008-01-29 17:42 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Linux Kernel Mailing List, Thomas Gleixner,
	H. Peter Anvin


* Yinghai Lu <Yinghai.Lu@Sun.COM> wrote:

> [PATCH] x86_64: fix overlap between pagetable with bss section
> 
> one early crash on one 8 node 256g machine

> +++ b/arch/x86/mm/init_64.c
> @@ -358,6 +358,8 @@ static void __init find_early_table_space(unsigned long end)
>  	if (table_start == -1UL)
>  		panic("Cannot find space for the kernel page tables");
>  
> +	/* need to round it up to avoid overlap less one page */
> +	table_start = round_up(table_start, PAGE_SIZE);
>  	table_start >>= PAGE_SHIFT;
>  	table_end = table_start;

thanks, applied.

	Ingo


* Re: [PATCH] x86_64: fix overlap between pagetable with bss section
  2008-01-29 18:03   ` Yinghai Lu
@ 2008-01-29 17:47     ` Ingo Molnar
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2008-01-29 17:47 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Linux Kernel Mailing List, Thomas Gleixner,
	H. Peter Anvin


* Yinghai Lu <Yinghai.Lu@Sun.COM> wrote:

> > > +	/* need to round it up to avoid overlap less one page */
> > > +	table_start = round_up(table_start, PAGE_SIZE);
> > >  	table_start >>= PAGE_SHIFT;
> > >  	table_end = table_start;
> > 
> > thanks, applied.
> 
> Can you use v2 instead? v2 has more comments.

yes, i have v2.

	Ingo


* Re: [PATCH] x86_64: fix overlap between pagetable with bss section
  2008-01-29 17:42 ` [PATCH] x86_64: fix overlap between pagetable with bss section Ingo Molnar
@ 2008-01-29 18:03   ` Yinghai Lu
  2008-01-29 17:47     ` Ingo Molnar
  0 siblings, 1 reply; 5+ messages in thread
From: Yinghai Lu @ 2008-01-29 18:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Linux Kernel Mailing List, Thomas Gleixner,
	H. Peter Anvin

On Tuesday 29 January 2008 09:42:43 am Ingo Molnar wrote:
> 
> * Yinghai Lu <Yinghai.Lu@Sun.COM> wrote:
> 
> > [PATCH] x86_64: fix overlap between pagetable with bss section
> > 
> > one early crash on one 8 node 256g machine
> 
> > +++ b/arch/x86/mm/init_64.c
> > @@ -358,6 +358,8 @@ static void __init find_early_table_space(unsigned long end)
> >  	if (table_start == -1UL)
> >  		panic("Cannot find space for the kernel page tables");
> >  
> > +	/* need to round it up to avoid overlap less one page */
> > +	table_start = round_up(table_start, PAGE_SIZE);
> >  	table_start >>= PAGE_SHIFT;
> >  	table_end = table_start;
> 
> thanks, applied.
> 
> 	Ingo
> 

Can you use v2 instead? v2 has more comments.

Thanks

Yinghai Lu

