LKML Archive on lore.kernel.org
* Re: ACPI early ioremap problems
       [not found]   ` <20080119152649.GA1375@one.firstfloor.org>
@ 2008-01-19 15:30     ` Ingo Molnar
  2008-01-19 15:46       ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-01-19 15:30 UTC
  To: Andi Kleen; +Cc: venkatesh.pallipadi, suresh.b.siddha, tglx, linux-kernel


(Cc:-ing to lkml, because this might interest others too)

* Andi Kleen <andi@firstfloor.org> wrote:

> > would be interesting to figure out what's going on here - i doubt 
> > it's an ACPI bug, because then we'd be getting a hard page fault, 
> > right? In
> 
> ACPI hasn't changed at all in git-x86 and it worked fine before. So I 
> don't really blame it.

yes, it did not change, but there are latent ACPI bugs/uncleanlinesses 
where it references an already unmapped table. This worked by chance 
until now, because we didn't actually unmap any tables - but now we 
explicitly map/unmap the tables via the MMU. But, i don't think that's 
the cause of the failure here - those bugs typically show up in other 
ways.

> > that case it's a 64-bit early_ioremap() bug that we want to find 
> > even if ACPI didnt use early_ioremap().
> > 
> > and this all runs before zap_low_mappings(), right?
> 
> No after. Since some time x86-64 does the equivalent of z_l_m() in 
> head64(); this means before start_kernel and definitely before 
> setup_arch which sets up ACPI.

that would mean early_ioremap() should switch to ioremap() after that 
point. Could you try that, does it resolve the failure you are seeing? 

Long-term we want to have a single, uniform ioremap() interface (on 
32-bit and 64-bit x86 as well) that can be used anytime, which just 
switches to the right lowlevel method depending on how far we are into 
the pagetable and memory subsystem bootstrap - instead of these more 
fragile "can we now use early_ioremap() or should we already be using 
ioremap()" usages.
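
A minimal sketch of that idea, written as a user-space C model rather 
than kernel code. Every name in it (boot_phase, uniform_ioremap(), 
mark_paging_ready(), the METHOD_* values) is hypothetical and only 
stands in for the real early_ioremap()/ioremap() paths; it shows the 
dispatch, not an actual mapping:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical boot phases; illustrative names, not kernel symbols. */
enum boot_phase {
	PHASE_EARLY,     /* before paging_init(): no page allocator yet */
	PHASE_PAGING_UP  /* after paging_init(): alloc_page() is available */
};

/* Which low-level method the wrapper picked. */
enum remap_method {
	METHOD_EARLY,    /* stands in for early_ioremap() */
	METHOD_NORMAL    /* stands in for ioremap() */
};

/* Toy result: records the request and the method chosen for it. */
struct remap_result {
	enum remap_method method;
	unsigned long phys;
	size_t len;
};

static enum boot_phase current_phase = PHASE_EARLY;

/* Called once by the (modelled) paging setup when it has finished. */
static void mark_paging_ready(void)
{
	current_phase = PHASE_PAGING_UP;
}

/* Single entry point: the caller never chooses the method itself. */
static struct remap_result uniform_ioremap(unsigned long phys, size_t len)
{
	struct remap_result r;

	assert(len > 0);
	r.method = (current_phase == PHASE_EARLY) ? METHOD_EARLY
						  : METHOD_NORMAL;
	r.phys = phys;
	r.len = len;
	return r;
}
```

With something of this shape, a caller such as the ACPI table code 
would call one function at any point during boot, and only the wrapper 
would need to know how far the bootstrap has progressed.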

	Ingo


* Re: ACPI early ioremap problems
  2008-01-19 15:30     ` ACPI early ioremap problems Ingo Molnar
@ 2008-01-19 15:46       ` Andi Kleen
  2008-01-19 18:45         ` Ingo Molnar
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2008-01-19 15:46 UTC
  To: Ingo Molnar
  Cc: Andi Kleen, venkatesh.pallipadi, suresh.b.siddha, tglx, linux-kernel

On Sat, Jan 19, 2008 at 04:30:55PM +0100, Ingo Molnar wrote:
> > > that case it's a 64-bit early_ioremap() bug that we want to find 
> > > even if ACPI didnt use early_ioremap().
> > > 
> > > and this all runs before zap_low_mappings(), right?
> > 
> > No after. Since some time x86-64 does the equivalent of z_l_m() in 
> > head64(); this means before start_kernel and definitely before 
> > setup_arch which sets up ACPI.
> 
> that would mean early_ioremap() should switch to ioremap() after that 
> point. Could you try that, does it resolve the failure you are seeing? 

ioremap() does alloc_page() and that won't work before 
paging_init(). The early ACPI scan is before paging_init() because
paging_init() needs at least node discovery, which requires
some ACPI tables.
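
The ordering constraint can be written down as a small user-space C 
model. This is only a sketch under simplifying assumptions: the 
boot_step names are hypothetical, the steps are flattened into one 
linear sequence, and "ioremap() usable" is reduced to "paging_init() 
has completed":

```c
#include <assert.h>
#include <stdbool.h>

/* Boot steps in the order described above for x86-64 (simplified). */
enum boot_step {
	STEP_HEAD64_ZAP_LOW,   /* equivalent of zap_low_mappings() */
	STEP_START_KERNEL,
	STEP_EARLY_ACPI_SCAN,  /* done from setup_arch(), needs table access */
	STEP_PAGING_INIT,      /* needs node discovery from ACPI tables */
	STEP_LATER
};

/* alloc_page() is modelled as available only after paging_init(). */
static bool page_allocator_ready(enum boot_step now)
{
	return now > STEP_PAGING_INIT;
}

/* ioremap() depends on alloc_page(), so it follows the allocator. */
static bool ioremap_usable(enum boot_step now)
{
	assert(now <= STEP_LATER);
	return page_allocator_ready(now);
}
```

In this model ioremap_usable(STEP_EARLY_ACPI_SCAN) is false, which is 
the reason a plain switch to ioremap() cannot serve the early ACPI 
scan.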

> Long-term we want to have a single, uniform ioremap() interface (on 
> 32-bit and 64-bit x86 as well) that can be used anytime, which just 
> switches to the right lowlevel method depending on how far we are into 
> the pagetable and memory subsystem bootstrap - instead of these more 
> fragile "can we now use early_ioremap() or should we already be using 
> ioremap()" usages.

I didn't think there were enough early/bt_ioremap() users for this
to be really worthwhile. The only code that does both I'm aware of
is the memory setup code.

-Andi



* Re: ACPI early ioremap problems
  2008-01-19 15:46       ` Andi Kleen
@ 2008-01-19 18:45         ` Ingo Molnar
  2008-01-19 19:16           ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-01-19 18:45 UTC
  To: Andi Kleen; +Cc: venkatesh.pallipadi, suresh.b.siddha, tglx, linux-kernel


* Andi Kleen <andi@firstfloor.org> wrote:

> On Sat, Jan 19, 2008 at 04:30:55PM +0100, Ingo Molnar wrote:
> > > > that case it's a 64-bit early_ioremap() bug that we want to find 
> > > > even if ACPI didnt use early_ioremap().
> > > > 
> > > > and this all runs before zap_low_mappings(), right?
> > > 
> > > No after. Since some time x86-64 does the equivalent of z_l_m() in 
> > > head64(); this means before start_kernel and definitely before 
> > > setup_arch which sets up ACPI.
> > 
> > that would mean early_ioremap() should switch to ioremap() after that 
> > point. Could you try that, does it resolve the failure you are seeing? 
> 
> ioremap() does alloc_page() and that won't work before paging_init(). 
> The early ACPI scan is before paging_init() because paging_init() 
> needs at least node discovery, which requires some ACPI tables.

hm, so are you saying that on 64-bit there's in essence no usable 
ioremap facility between zap_low_mappings() and paging_init()? 
(early_ioremap() is not usable anymore, and ioremap() is not yet 
usable.) I guess we'll have to pick up the 32-bit early_ioremap() code 
for 64-bit as well.

	Ingo


* Re: ACPI early ioremap problems
  2008-01-19 18:45         ` Ingo Molnar
@ 2008-01-19 19:16           ` Andi Kleen
  2008-01-20 16:48             ` ACPI early ioremap problems II Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2008-01-19 19:16 UTC
  To: Ingo Molnar
  Cc: Andi Kleen, venkatesh.pallipadi, suresh.b.siddha, tglx, linux-kernel

> hm, so are you saying that on 64-bit there's in essence no usable 
> ioremap facility between zap_low_mappings() and paging_init()? 
> (early_ioremap() is not usable anymore, and ioremap() is not yet 
> usable.) I guess we'll have to pick up the 32-bit early_ioremap() code 
> for 64-bit as well.

No, early_ioremap() should work. Or rather, it used to work.

It does map into the kernel mapping which is not zapped. It just broke 
recently.

But it definitely used to work because several users used it before
paging_init() even before the recent PAT changes.

-Andi


* Re: ACPI early ioremap problems II
  2008-01-19 19:16           ` Andi Kleen
@ 2008-01-20 16:48             ` Andi Kleen
  2008-01-20 23:50               ` Ingo Molnar
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2008-01-20 16:48 UTC
  To: Andi Kleen
  Cc: Ingo Molnar, venkatesh.pallipadi, suresh.b.siddha, tglx, linux-kernel


As a followup I see this problem on three different 64bit machines now.
Symptom is usually that only one core is active because ACPI doesn't
see the other processors in its tables.

-Andi



* Re: ACPI early ioremap problems II
  2008-01-20 16:48             ` ACPI early ioremap problems II Andi Kleen
@ 2008-01-20 23:50               ` Ingo Molnar
  2008-01-21  2:59                 ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-01-20 23:50 UTC
  To: Andi Kleen; +Cc: venkatesh.pallipadi, suresh.b.siddha, tglx, linux-kernel


* Andi Kleen <andi@firstfloor.org> wrote:

> As a followup I see this problem on three different 64bit machines 
> now. Symptom is usually that only one core is active because ACPI 
> doesn't see the other processors in its tables.

to have a chance to fix it we need you to meet the minimum 
threshold for bugreports: please send the failing .config and a full 
boot message of the incident as well. (Please also check latest x86.git, 
maybe it's something that got fixed today.)

I have tested latest x86.git#mm on 3 different 64bit machines and Thomas 
tested it too, and none of us saw any crashes. So whatever you are 
seeing, it's unfortunately not very common to trigger.

	Ingo


* Re: ACPI early ioremap problems II
  2008-01-20 23:50               ` Ingo Molnar
@ 2008-01-21  2:59                 ` Andi Kleen
  0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2008-01-21  2:59 UTC
  To: Ingo Molnar
  Cc: Andi Kleen, venkatesh.pallipadi, suresh.b.siddha, tglx, linux-kernel

On Mon, Jan 21, 2008 at 12:50:00AM +0100, Ingo Molnar wrote:
> 
> * Andi Kleen <andi@firstfloor.org> wrote:
> 
> > As a followup I see this problem on three different 64bit machines 
> > now. Symptom is usually that only one core is active because ACPI 
> > doesn't see the other processors in its tables.
> 
> to have a chance to fix it we need you to meet the minimum 
> threshold for bugreports: please send the failing .config and a full 
> boot message of the incident as well. (Please also check latest x86.git, 
> maybe it's something that got fixed today.)

The other crash seems to have gone away on git tip 
28a0fcd6b38e247200bd857996375aee91eae8ce now. I also don't see the missing 
nodes or missing cores anymore. Also the extended memory on AMD
seems to be detected correctly again.

Thanks,

-Andi

