LKML Archive on lore.kernel.org
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Jeremy Fitzhardinge @ 2008-11-06 19:15 UTC
To: Ian Campbell
Cc: Christopher S. Aker, xen devel, Jan Beulich, Ingo Molnar,
Linux Kernel Mailing List
Ian Campbell wrote:
> On Wed, 2008-11-05 at 10:31 -0500, Christopher S. Aker wrote:
>
>> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
>> http://p.linode.com/1408
>>
>
> I've been seeing this too. I bisected it down to:
>
> ab00fee30cddf975200b3c97aef25bea144a0d89 is first bad commit
> commit ab00fee30cddf975200b3c97aef25bea144a0d89
> Author: Jan Beulich <jbeulich@novell.com>
> Date: Thu Oct 30 10:37:21 2008 +0000
>
> i386/PAE: fix pud_page()
>
> Impact: cleanup
>
> To the unsuspecting user it is quite annoying that this broken and
> inconsistent with x86-64 definition still exists.
>
> Signed-off-by: Jan Beulich <jbeulich@novell.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>
> :040000 040000 3b49a9d3792e9f02dd799ad4deb69922d2a085d0 f0136498ef53b36172dca595f11a784f43bebcea M arch
>
> It's late so figuring out how it broke can wait for tomorrow.
>
> The interesting bit from the link given is below.
>

Ah, OK.

Ingo, Jan:

Did this patch actually fix anything, or was it just a cleanup? It
seems to have broken 32-bit Xen in some way, so if it's just a cleanup it
would be best to drop it until we've worked out what's going on.

Thanks,
    J

> Ian.
>
>
> 1 multicall(s) failed: cpu 0
> Pid: 1, comm: swapper Not tainted 2.6.28-rc3-test1 #1
> Call Trace:
> [<c0103e41>] xen_mc_flush+0xb1/0x180
> [<c010478f>] xen_do_pin+0x3f/0x90
> [<c0104e3f>] __xen_pgd_pin+0xcf/0x140
> [<c0104f0d>] xen_activate_mm+0x1d/0x30
> [<c018f12c>] flush_old_exec+0x29c/0x740
> [<c018e2db>] kernel_read+0x3b/0x60
> [<c01bd228>] load_elf_binary+0x198/0x16c0
> [<c0104159>] xen_set_pte+0x19/0x30
> [<c0173ef6>] handle_mm_fault+0xa46/0xc70
> [<c01719be>] vm_normal_page+0x4e/0xa0
> [<c0171cd9>] follow_page+0x2c9/0x320
> [<c0174245>] __get_user_pages+0x125/0x3e0
> [<c018e0ca>] get_arg_page+0x4a/0xb0
> [<c01bd090>] load_elf_binary+0x0/0x16c0
> [<c018fb32>] search_binary_handler+0xa2/0x230
> [<c018fe88>] do_execve+0x1c8/0x210
> [<c010685f>] sys_execve+0x2f/0x50
> [<c01085b6>] syscall_call+0x7/0xb
> [<c058007b>] sctp_setsockopt+0xd2b/0x1060
> [<c01800d8>] sys_swapon+0x308/0xaf0
> [<c010c4fc>] kernel_execve+0x1c/0x30
> [<c0102292>] init_post+0xb2/0x100
> [<c01092f3>] kernel_thread_helper+0x7/0x10
> call 1/8: op=14 arg=[d5963000] result=0
> call 2/8: op=14 arg=[d5964000] result=0
> call 3/8: op=14 arg=[d5965000] result=0
> call 4/8: op=14 arg=[d5968000] result=0
> call 5/8: op=26 arg=[c12d5880] result=0
> call 6/8: op=14 arg=[d5962000] result=0
> call 7/8: op=14 arg=[c12b2000] result=0
> call 8/8: op=26 arg=[c12d5890] result=-22
> BUG: unable to handle kernel paging request at c12b2d0c
> IP: [<c0106550>] xen_spin_unlock+0x0/0x10
> *pdpt = 00000002ccfb6027
> Oops: 0003 [#1] SMP
> last sysfs file:
> Modules linked in:
>
> Pid: 1, comm: swapper Not tainted (2.6.28-rc3-test1 #1)
> EIP: 0061:[<c0106550>] EFLAGS: 00010002 CPU: 0
> EIP is at xen_spin_unlock+0x0/0x10
> EAX: c12b2d0c EBX: 00000001 ECX: 00000000 EDX: c12d5a80
> ESI: 00000001 EDI: c12d5a80 EBP: c12d5080 ESP: d603fd38
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: e021
> Process swapper (pid: 1, ti=d603e000 task=d603d8a0 task.ti=d603e000)
> Stack:
> c0104245 c0103eb3 c062b9a4 00000008 00000008 0000001a c12d5890 ffffffea
> 00000000 00000003 00000000 00000000 d5960e40 c0104e3f 15966001 00000000
> d5962000 d5960e40 d5960e84 d603d8a0 d603dbe4 c0104f0d d5960e40 c0699d00
> Call Trace:
> [<c0104245>] xen_pte_unlock+0x5/0x10
> [<c0103eb3>] xen_mc_flush+0x123/0x180
> [<c0104e3f>] __xen_pgd_pin+0xcf/0x140
> [<c0104f0d>] xen_activate_mm+0x1d/0x30
> [<c018f12c>] flush_old_exec+0x29c/0x740
> [<c018e2db>] kernel_read+0x3b/0x60
> [<c01bd228>] load_elf_binary+0x198/0x16c0
> [<c0104159>] xen_set_pte+0x19/0x30
> [<c0173ef6>] handle_mm_fault+0xa46/0xc70
> [<c01719be>] vm_normal_page+0x4e/0xa0
> [<c0171cd9>] follow_page+0x2c9/0x320
> [<c0174245>] __get_user_pages+0x125/0x3e0
> [<c018e0ca>] get_arg_page+0x4a/0xb0
> [<c01bd090>] load_elf_binary+0x0/0x16c0
> [<c018fb32>] search_binary_handler+0xa2/0x230
> [<c018fe88>] do_execve+0x1c8/0x210
> [<c010685f>] sys_execve+0x2f/0x50
> [<c01085b6>] syscall_call+0x7/0xb
> [<c058007b>] sctp_setsockopt+0xd2b/0x1060
> [<c01800d8>] sys_swapon+0x308/0xaf0
> [<c010c4fc>] kernel_execve+0x1c/0x30
> [<c0102292>] init_post+0xb2/0x100
> [<c01092f3>] kernel_thread_helper+0x7/0x10
> Code: 6d c0 e8 d4 51 2c 00 83 f8 0f 89 c1 7f 1a 8b 04 8d 80 95 6d c0 39
> 34 03 75 e1 5b ba 03 00 00 00 89 c8 5e e9 73 c2 2d 00 5b 5e c3 <c6> 00
> 00 66 83 78 02 00 75 01 c3 eb b3 8d 76 00 0f 0b eb fe 8d
> EIP: [<c0106550>] xen_spin_unlock+0x0/0x10 SS:ESP e021:d603fd38
> ---[ end trace 72dbea1e75327c37 ]---
>
>
>
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Ingo Molnar @ 2008-11-06 21:16 UTC
To: Jeremy Fitzhardinge
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Ian Campbell wrote:
>> On Wed, 2008-11-05 at 10:31 -0500, Christopher S. Aker wrote:
>>
>>> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
>>> http://p.linode.com/1408
>>>
>>
>> I've been seeing this too. I bisected it down to:
>>
>> ab00fee30cddf975200b3c97aef25bea144a0d89 is first bad commit
>> commit ab00fee30cddf975200b3c97aef25bea144a0d89
>> Author: Jan Beulich <jbeulich@novell.com>
>> Date: Thu Oct 30 10:37:21 2008 +0000
>> i386/PAE: fix pud_page()
>> Impact: cleanup
>> To the unsuspecting user it is quite annoying
>> that this broken and
>> inconsistent with x86-64 definition still exists.
>> Signed-off-by: Jan Beulich
>> <jbeulich@novell.com>
>> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>> :040000 040000 3b49a9d3792e9f02dd799ad4deb69922d2a085d0
>> f0136498ef53b36172dca595f11a784f43bebcea M arch
>>
>> It's late so figuring out how it broke can wait for tomorrow.
>>
>> The interesting bit from the link given is below.
>>
>
> Ah, OK.
>
> Ingo, Jan:
>
> Did this patch actually fix anything, or was it just a cleanup? It
> seems to have broken 32-bit Xen in some way, so if it's just a cleanup it
> would be best to drop it until we've worked out what's going on.

no, it was pure cleanup. The impact line shows this:

>> Impact: cleanup

a "cleanup" impact line is only added if the change is not intended to
have any side-effects whatsoever.

We can drop it but it would be really nice to figure out what's going
on. In a very quick late-night look i cannot see anything particularly
weird about it, but based on the type of changes it does there are
three leading candidates: lost high 32 bits, zero extend problem, or
incorrect types.

	Ingo
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Jeremy Fitzhardinge @ 2008-11-06 21:20 UTC
To: Ingo Molnar
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
Ingo Molnar wrote:
> a "cleanup" impact line is only added if the change is not intended to
> have any side-effects whatsoever.
>
> We can drop it but it would be really nice to figure out what's going
> on. In a very quick late-night look i cannot see anything particularly
> weird about it, but based on the type of changes it does there are
> three leading candidates: lost high 32 bits, zero extend problem, or
> incorrect types.

Yeah, I couldn't see anything either. It's a reasonable cleanup (I
never did understand that struct page * cast), but it's always nicer when
cleanups don't break working code ;).

    J
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Ingo Molnar @ 2008-11-06 21:22 UTC
To: Jeremy Fitzhardinge
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Ingo Molnar wrote:
>> a "cleanup" impact line is only added if the change is not intended to
>> have any side-effects whatsoever.
>>
>> We can drop it but it would be really nice to figure out what's going
>> on. In a very quick late-night look i cannot see anything particularly
>> weird about it, but based on the type of changes it does there are
>> three leading candidates: lost high 32 bits, zero extend problem, or
>> incorrect types.
>
> Yeah, I couldn't see anything either. It's a reasonable cleanup (I
> never did understand that struct page * cast), but it's always nicer
> when cleanups don't break working code ;).
Would be nice to have a look at the vmlinux delta with the patch
reverted, on the .config that breaks. By all means the object code
should be the same.
Ingo
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Jeremy Fitzhardinge @ 2008-11-06 21:28 UTC
To: Ingo Molnar
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
Ingo Molnar wrote:
> a "cleanup" impact line is only added if the change is not intended to
> have any side-effects whatsoever.
>
> We can drop it but it would be really nice to figure out what's going
> on. In a very quick late-night look i cannot see anything particularly
> weird about it, but based on the type of changes it does there are
> three leading candidates: lost high 32 bits, zero extend problem, or
> incorrect types.
>
Interestingly, the Xen code appears to be the *only* user of pud_page -
and only via pgd_page in PAE mode.
J
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Ingo Molnar @ 2008-11-06 21:33 UTC
To: Jeremy Fitzhardinge
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Ingo Molnar wrote:
>> a "cleanup" impact line is only added if the change is not intended to
>> have any side-effects whatsoever.
>>
>> We can drop it but it would be really nice to figure out what's going
>> on. In a very quick late-night look i cannot see anything particularly
>> weird about it, but based on the type of changes it does there are
>> three leading candidates: lost high 32 bits, zero extend problem, or
>> incorrect types.
>
> Interestingly, the Xen code appears to be the *only* user of
> pud_page - and only via pgd_page in PAE mode.

where exactly is that use? My grep didn't show any users of pud_page().
pud_page() was changed in an incompatible way, all users of it must be
updated.

	Ingo
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Jeremy Fitzhardinge @ 2008-11-06 21:48 UTC
To: Ingo Molnar
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
Ingo Molnar wrote:
> where exactly is that use? My grep didn't show any users of pud_page().
> pud_page() was changed in an incompatible way, all users of it must be
> updated.
>
pgd_page() uses it in pgtable-nopud.h, so any users of pgd_page() also
need to be looked at. It so happens the only user is
arch/x86/xen/mmu.c, which expects it to return the vaddr. Fixed below.
J
Subject: xen: fix use of pgd_page now that it really does return a page
On 32-bit PAE, pud_page, for no good reason, didn't really return a
struct page *. Since Jan Beulich's fix "i386/PAE: fix pud_page()",
pud_page does return a struct page *.
Because PAE has 3 pagetable levels, the pud level is folded into the
pgd level, so pgd_page() is the same as pud_page(), and now returns
a struct page *. Update the xen/mmu.c code which uses pgd_page()
accordingly.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/xen/mmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
===================================================================
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -877,7 +877,7 @@
 #else /* CONFIG_X86_32 */
 #ifdef CONFIG_X86_PAE
 	/* Need to make sure unshared kernel PMD is pinnable */
-	xen_pin_page(mm, virt_to_page(pgd_page(pgd[pgd_index(TASK_SIZE)])),
+	xen_pin_page(mm, pgd_page(pgd[pgd_index(TASK_SIZE)]),
 		     PT_PMD);
 #endif
 	xen_do_pin(MMUEXT_PIN_L3_TABLE, PFN_DOWN(__pa(pgd)));
@@ -994,7 +994,7 @@
 #ifdef CONFIG_X86_PAE
 	/* Need to make sure unshared kernel PMD is unpinned */
-	xen_unpin_page(mm, virt_to_page(pgd_page(pgd[pgd_index(TASK_SIZE)])),
+	xen_unpin_page(mm, pgd_page(pgd[pgd_index(TASK_SIZE)]),
 		       PT_PMD);
 #endif
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Ingo Molnar @ 2008-11-06 22:20 UTC
To: Jeremy Fitzhardinge
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Ingo Molnar wrote:
>> where exactly is that use? My grep didn't show any users of pud_page().
>> pud_page() was changed in an incompatible way, all users of it must be
>> updated.
>>
>
> pgd_page() uses it in pgtable-nopud.h, so any users of pgd_page()
> also need to be looked at. It so happens the only user is
> arch/x86/xen/mmu.c, which expects it to return the vaddr. Fixed
> below.
ah! asm-generic was missed by my grep. (and i suspect Jan missed it
too)
> Subject: xen: fix use of pgd_page now that it really does return a page
applied to tip/x86/urgent, thanks!
Ingo
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Jeremy Fitzhardinge @ 2008-11-06 22:29 UTC
To: Ingo Molnar
Cc: Ian Campbell, Christopher S. Aker, xen devel, Jan Beulich,
Linux Kernel Mailing List
Ingo Molnar wrote:
> ah! asm-generic was missed by my grep. (and i suspect Jan missed it
> too)
>
cscope is your friend.
>> Subject: xen: fix use of pgd_page now that it really does return a page
>>
>
> applied to tip/x86/urgent, thanks!
>
Thanks,
J
* Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
From: Jan Beulich @ 2008-11-07 9:53 UTC
To: Ingo Molnar, Jeremy Fitzhardinge
Cc: Ian Campbell, xen devel, Christopher S. Aker, Linux Kernel Mailing List
>>> Ingo Molnar <mingo@elte.hu> 06.11.08 23:20 >>>
>
>* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
>> Ingo Molnar wrote:
>>> where exactly is that use? My grep didn't show any users of pud_page().
>>> pud_page() was changed in an incompatible way, all users of it must be
>>> updated.
>>>
>>
>> pgd_page() uses it in pgtable-nopud.h, so any users of pgd_page()
>> also need to be looked at. It so happens the only user is
>> arch/x86/xen/mmu.c, which expects it to return the vaddr. Fixed
>> below.
>
>ah! asm-generic was missed by my grep. (and i suspect Jan missed it
>too)
Indeed - broken as it was I never even considered this could be used
somewhere in generic code.
>> Subject: xen: fix use of pgd_page now that it really does return a page
>
>applied to tip/x86/urgent, thanks!
And my thanks, too, Jeremy, for the quick spotting of the problem.
Jan