LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* system without RAM on node0 boot fail
@ 2008-01-31  4:26 Yinghai Lu
  2008-01-31  5:02 ` Yinghai Lu
  0 siblings, 1 reply; 9+ messages in thread
From: Yinghai Lu @ 2008-01-31  4:26 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel


one system not install RAM on node0 got

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 1 PXM 1 0-a0000
SRAT: Node 1 PXM 1 0-dd000000
SRAT: Node 1 PXM 1 0-123000000
Bootmem setup node 1 0000000000000000-0000000123000000
  NODE_DATA [000000000000e000 - 0000000000014fff]
  bootmap [0000000000015000 -  00000000000395ff] pages 25
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] SMP_TRAMPOLINE
early res: 2 [200000-d9c273] TEXT DATA BSS
early res: 3 [7e700000-7ffffa25] RAMDISK
early res: 4 [9bc00-9dbff] EBDA
early res: 5 [8000-dfff] PGTABLE
Could not find start_pfn for node 0
Pid: 0, comm: swapper Not tainted 2.6.24-smp-04921-gbce08dc-dirty #42

Call Trace:
 [<ffffffff80c200d6>] free_area_init_node+0x22/0x381
 [<ffffffff804539e5>] generic_swap+0x0/0x17
 [<ffffffff80c06a1b>] find_zone_movable_pfns_for_nodes+0x54/0x271
 [<ffffffff80c06f4a>] free_area_init_nodes+0x239/0x287
 [<ffffffff80c0203a>] paging_init+0x46/0x4c
 [<ffffffff80bf9dba>] setup_arch+0x3c3/0x44e
 [<ffffffff80bf38d3>] start_kernel+0x6f/0x2c7
 [<ffffffff80bf31e1>] _sinittext+0x1e1/0x1e8

RIP 0x10

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-01-31  5:02 ` Yinghai Lu
@ 2008-01-31  4:51   ` Christoph Lameter
  2008-01-31  6:09     ` H. Peter Anvin
  2008-02-01 18:11     ` dean gaudet
  0 siblings, 2 replies; 9+ messages in thread
From: Christoph Lameter @ 2008-01-31  4:51 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Ingo Molnar, linux-kernel

x86 supports booting from a node without RAM?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-01-31  4:26 system without RAM on node0 boot fail Yinghai Lu
@ 2008-01-31  5:02 ` Yinghai Lu
  2008-01-31  4:51   ` Christoph Lameter
  0 siblings, 1 reply; 9+ messages in thread
From: Yinghai Lu @ 2008-01-31  5:02 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Christoph Lameter

current x86.git
Command line: apic=debug acpi.debug_level=0x0000000F debug initcall_debug pci=routeirq ramdisk_size=131072 root=/dev/ram0 rw ip=dhcp console=uart8250,io,0x3f8,115200n8
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 000000000009bc00 (usable)
 BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000dcff0000 (usable)
 BIOS-e820: 00000000dcff0000 - 00000000dcffe000 (ACPI data)
 BIOS-e820: 00000000dcffe000 - 00000000dd000000 (ACPI NVS)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000123000000 (usable)
Early serial console at I/O port 0x3f8 (options '115200n8')
console [uart0] enabled
end_pfn_map = 1191936
DMI present.
...
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 1 PXM 1 0-a0000
SRAT: Node 1 PXM 1 0-dd000000
SRAT: Node 1 PXM 1 0-123000000
ACPI: SLIT: nodes = 2
 10 13
 13 10
mapped APIC to ffffffffff5fb000 (        fee00000)
Bootmem setup node 1 0000000000000000-0000000123000000
  NODE_DATA [000000000000e000 - 0000000000014fff]
  bootmap [0000000000015000 -  00000000000395ff] pages 25
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] SMP_TRAMPOLINE
early res: 2 [200000-d9c273] TEXT DATA BSS
early res: 3 [7e6f4000-7fff3a25] RAMDISK
early res: 4 [9bc00-9dbff] EBDA
early res: 5 [8000-dfff] PGTABLE
Could not find start_pfn for node 0
Pid: 0, comm: swapper Not tainted 2.6.24-smp-04921-gbce08dc-dirty #43

Call Trace:
 [<ffffffff80c200d6>] free_area_init_node+0x22/0x381
 [<ffffffff804539e5>] generic_swap+0x0/0x17
 [<ffffffff80c06a1b>] find_zone_movable_pfns_for_nodes+0x54/0x271
 [<ffffffff80c06f4a>] free_area_init_nodes+0x239/0x287
 [<ffffffff80c0203a>] paging_init+0x46/0x4c
 [<ffffffff80bf9dba>] setup_arch+0x3c3/0x44e
 [<ffffffff80bf38d3>] start_kernel+0x6f/0x2c7
 [<ffffffff80bf31e1>] _sinittext+0x1e1/0x1e8

RIP 0x10


2.6.24 
discontinuous and slab works well

2.6.24
sparse and slub will get oops
ehci_hcd 0000:00:02.1: EHCI Host Controller
Unable to handle kernel paging request at 0000000000003078 RIP:
 [<ffffffff8026585f>] __alloc_pages+0x7d/0x33a
PGD 0
Oops: 0000 [1] SMP
CPU 3
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.24-smp #1
RIP: 0010:[<ffffffff8026585f>]  [<ffffffff8026585f>] __alloc_pages+0x7d/0x33a
RSP: 0018:ffff810122a55bc0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
RDX: 0000000000003070 RSI: 0000000000000000 RDI: 00000000000000d0
RBP: 00000000000000d0 R08: ffff810001025be0 R09: ffff810122a55b00
R10: ffffffff8085e2a0 R11: 00000000000000a0 R12: 0000000000003070
R13: 0000000000000000 R14: 00000000000000d0 R15: ffff810122a52000
FS:  0000000000000000(0000) GS:ffff810122c02f00(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000003078 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff810122a54000, task ffff810122a52000)
Stack:  ffff810122a55d20 0000000080455b55 0000000000000000 0000000000000080
 0000000000003078 0000000022a55d10 00000010000200d0 ffff81011f5ea000
 0000000000000020 0000000000000000 00000000fffffff2 0000000000000000
Call Trace:
 [<ffffffff80282ea1>] new_slab+0xdd/0x236
 [<ffffffff802831a1>] __slab_alloc+0x1a7/0x397
 [<ffffffff804cea6f>] dma_pool_create+0x86/0x147
 [<ffffffff80283697>] kmem_cache_alloc_node+0x3e/0x6d
 [<ffffffff804cea6f>] dma_pool_create+0x86/0x147
 [<ffffffff80667e52>] hcd_buffer_create+0x57/0x89
 [<ffffffff80450032>] compat_blkdev_ioctl+0xd72/0x11f4
 [<ffffffff80662da4>] usb_add_hcd+0x72/0x59f
 [<ffffffff8066c541>] usb_hcd_pci_probe+0x1e4/0x28b
 [<ffffffff80461de6>] pci_device_probe+0xd1/0x136
 [<ffffffff804caf0f>] driver_probe_device+0xd3/0x150
 [<ffffffff804cb02e>] __driver_attach+0x0/0x93
 [<ffffffff804cb088>] __driver_attach+0x5a/0x93
 [<ffffffff804ca36d>] bus_for_each_dev+0x43/0x6e
 [<ffffffff804ca69d>] bus_add_driver+0x79/0x1bd
 [<ffffffff80461fbe>] __pci_register_driver+0x5b/0x8d
 [<ffffffff80bbf63a>] kernel_init+0x175/0x2e1
 [<ffffffff8020cd28>] child_rip+0xa/0x12
 [<ffffffff80bbf4c5>] kernel_init+0x0/0x2e1
 [<ffffffff8020cd1e>] child_rip+0x0/0x12


Code: 49 83 7c 24 08 00 75 0e 48 c7 44 24 38 00 00 00 00 e9 93 02
RIP  [<ffffffff8026585f>] __alloc_pages+0x7d/0x33a
 RSP <ffff810122a55bc0>
CR2: 0000000000003078
---[ end trace c08baa60a7f2ad32 ]---
Kernel panic - not syncing: Attempted to kill init!


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-01-31  4:51   ` Christoph Lameter
@ 2008-01-31  6:09     ` H. Peter Anvin
  2008-01-31  7:06       ` Yinghai Lu
  2008-02-01 18:11     ` dean gaudet
  1 sibling, 1 reply; 9+ messages in thread
From: H. Peter Anvin @ 2008-01-31  6:09 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Yinghai Lu, Ingo Molnar, linux-kernel

Christoph Lameter wrote:
> x86 supports booting from a node without RAM?

 From the looks of it I would say he probably has the boot node numbered 1.

The e820 map is also "interesting" - doesn't list the first 256 bytes, 
which corresponds to the first quarter(!) of the real-mode exception table.

	-hpa

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-01-31  6:09     ` H. Peter Anvin
@ 2008-01-31  7:06       ` Yinghai Lu
  2008-01-31  7:22         ` H. Peter Anvin
  0 siblings, 1 reply; 9+ messages in thread
From: Yinghai Lu @ 2008-01-31  7:06 UTC (permalink / raw)
  To: H. Peter Anvin, Eric W. Biederman, Andi Kleen
  Cc: Christoph Lameter, Ingo Molnar, linux-kernel

On Jan 30, 2008 10:09 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Christoph Lameter wrote:
> > x86 supports booting from a node without RAM?

it is a two sockets system. only 4G RAM installed on node1.

>
>  From the looks of it I would say he probably has the boot node numbered 1.
>
> The e820 map is also "interesting" - doesn't list the first 256 bytes,
> which corresponds to the first quarter(!) of the real-mode exception table.

i kexec that from 2.6.24 (with discontinuous and slab)
so that e820 is passed by kexec from first kernel. normal pxeboot will
have start for 0.

YH

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-01-31  7:06       ` Yinghai Lu
@ 2008-01-31  7:22         ` H. Peter Anvin
  2008-01-31  7:39           ` Andi Kleen
  0 siblings, 1 reply; 9+ messages in thread
From: H. Peter Anvin @ 2008-01-31  7:22 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Eric W. Biederman, Andi Kleen, Christoph Lameter, Ingo Molnar,
	linux-kernel

Yinghai Lu wrote:
> On Jan 30, 2008 10:09 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> Christoph Lameter wrote:
>>> x86 supports booting from a node without RAM?
> 
> it is a two sockets system. only 4G RAM installed on node1.
> 

"Node 1" is the boot CPU, though, right?

I don't know if the spec requires node 0 to be the boot node.  Probably not.

	-hpa


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-01-31  7:22         ` H. Peter Anvin
@ 2008-01-31  7:39           ` Andi Kleen
  0 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2008-01-31  7:39 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Yinghai Lu, Eric W. Biederman, Christoph Lameter, Ingo Molnar,
	linux-kernel

On Thursday 31 January 2008 08:22:15 H. Peter Anvin wrote:
> Yinghai Lu wrote:
> > On Jan 30, 2008 10:09 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> >> Christoph Lameter wrote:
> >>> x86 supports booting from a node without RAM?
> > 
> > it is a two sockets system. only 4G RAM installed on node1.
> > 
> 
> "Node 1" is the boot CPU, though, right?
> 
> I don't know if the spec requires node 0 to be the boot node.  Probably not.

There is no spec I know of that completely defines "nodes" on x86.
Actually I think on Linux calls them that.

There is the ACPI 3.0 SRAT spec that defines memory affinity,
but I don't think it has any requirements about where the memory must be.

Even if there was a spec people who actually put in DIMMs tend
to violate it. It seems to be not totally uncommon to just 
stuff them all into the same corner of the motherboard to give 
a "tidy appearance" (for non physicists :-) and that usually results in 
memory less nodes.

Anyways this area is something that regresses regularly. I had
fixed it several times and tested all cases on SimNow, but after
some time it tends to bit rot again unfortunately. The people who
usually test kernels probably know where to put the DIMMs in.

Probably just happened again.

-Andi



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-01-31  4:51   ` Christoph Lameter
  2008-01-31  6:09     ` H. Peter Anvin
@ 2008-02-01 18:11     ` dean gaudet
  2008-02-01 19:15       ` Yinghai Lu
  1 sibling, 1 reply; 9+ messages in thread
From: dean gaudet @ 2008-02-01 18:11 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Yinghai Lu, Ingo Molnar, linux-kernel

actually yeah i've seen this... in a bizarre failure situation in a system 
which physically had RAM in the boot node but it was never enumerated for 
the kernel (other nodes had RAM which was enumerated).

so technically there was boot node RAM but the kernel never saw it.

-dean

On Wed, 30 Jan 2008, Christoph Lameter wrote:

> x86 supports booting from a node without RAM?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: system without RAM on node0 boot fail
  2008-02-01 18:11     ` dean gaudet
@ 2008-02-01 19:15       ` Yinghai Lu
  0 siblings, 0 replies; 9+ messages in thread
From: Yinghai Lu @ 2008-02-01 19:15 UTC (permalink / raw)
  To: dean gaudet; +Cc: Christoph Lameter, Ingo Molnar, linux-kernel

On Feb 1, 2008 10:11 AM, dean gaudet <dean@arctic.org> wrote:
> actually yeah i've seen this... in a bizarre failure situation in a system
> which physically had RAM in the boot node but it was never enumerated for
> the kernel (other nodes had RAM which was enumerated).
>
> so technically there was boot node RAM but the kernel never saw it.

BIOS sometime disabled some dimms on node that it thought that dimm
was bad and caused mce error in last boot.

YH

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-02-01 19:16 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-31  4:26 system without RAM on node0 boot fail Yinghai Lu
2008-01-31  5:02 ` Yinghai Lu
2008-01-31  4:51   ` Christoph Lameter
2008-01-31  6:09     ` H. Peter Anvin
2008-01-31  7:06       ` Yinghai Lu
2008-01-31  7:22         ` H. Peter Anvin
2008-01-31  7:39           ` Andi Kleen
2008-02-01 18:11     ` dean gaudet
2008-02-01 19:15       ` Yinghai Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).