LKML Archive on lore.kernel.org
* guest crash on 2.6.20-rc4
@ 2007-01-08 16:18 Roland Dreier
  2007-01-09  8:53 ` [kvm-devel] " Avi Kivity
  2007-01-09 21:26 ` Avi Kivity
  0 siblings, 2 replies; 6+ messages in thread
From: Roland Dreier @ 2007-01-08 16:18 UTC
  To: kvm-devel, linux-kernel

I'm running a 64-bit Fedora 6 install as a guest on a host running
2.6.20-rc4 with the kvm-10 userspace release.  The CPU is a Xeon 5160
and I have 6 GB of RAM.  The guest is given 512 MB of memory.  I left
the guest idle overnight, and the makewhatis cron job seems to have
triggered this:

    Unable to handle kernel paging request at ffff81000ba04000 RIP:
     [<ffffffff8025f402>] clear_page+0x16/0x44
    PGD 8063 PUD 9063 PMD 800000000ba001e3 PTE aad8a7d881d984d9
    Oops: 0003 [1] SMP
    last sysfs file: /block/hda/removable
    CPU 0
    Modules linked in: autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables dm_multipath video sbs i2c_ec i2c_core button battery asus_acpi ac ipv6 parport_pc lp parport floppy pcspkr ne2k_pci 8390 serio_raw ide_cd cdrom dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
    Pid: 4687, comm: makewhatis Not tainted 2.6.18-1.2869.fc6 #1
    RIP: 0010:[<ffffffff8025f402>]  [<ffffffff8025f402>] clear_page+0x16/0x44
    RSP: 0018:ffff810003e85c40  EFLAGS: 00010216
    RAX: 0000000000000000 RBX: ffff8100012e9140 RCX: 000000000000003f
    RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff81000ba04000
    RBP: 0000000000000001 R08: ffff81001fdc4d8e R09: 00000000000021e6
    R10: 0000000000000000 R11: 0000000000000001 R12: ffff8100012e9100
    R13: ffff81000000b500 R14: ffff81000000c400 R15: 0000000000000001
    FS:  00002aaaaaac6db0(0000) GS:ffffffff805e4000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: ffff81000ba04000 CR3: 0000000003c05000 CR4: 00000000000006e0
    Process makewhatis (pid: 4687, threadinfo ffff810003e84000, task ffff81001f3a77f0)
    Stack:  ffffffff8020a632 ffff81000000c400 0000004400000000 ffff81000000c400
     000284d000000000 0000000000000001 0000000000000001 00000000000084d0
     ffff81000000c400 00000000000084d0 ffff81000000c400 ffff81001f3a77f0
    Call Trace:
    Inexact backtrace:
     [<ffffffff8020a632>] get_page_from_freelist+0x351/0x3c4
     [<ffffffff8020f00b>] __alloc_pages+0x76/0x2c3
     [<ffffffff8022b9d3>] get_zeroed_page+0x21/0x74
     [<ffffffff8022e5be>] __pud_alloc+0x14/0x90
     [<ffffffff80208091>] copy_page_range+0x122/0x6c1
     [<ffffffff802224de>] dup_fd+0x208/0x2a8
     [<ffffffff8021f5ec>] copy_process+0xd28/0x159d
     [<ffffffff80230fd9>] do_fork+0x69/0x163
     [<ffffffff802623a4>] _spin_lock_irqsave+0x9/0xe
     [<ffffffff8025bcce>] system_call+0x7e/0x83
     [<ffffffff8025bfdb>] ptregscall_common+0x67/0xac
    
    
    Code: 48 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48
    RIP  [<ffffffff8025f402>] clear_page+0x16/0x44
     RSP <ffff810003e85c40>
    CR2: ffff81000ba04000

I just let yum update the guest to the 2.6.18-1.2869.fc6 kernel, but
I'm more suspicious of the MMU changes to kvm...

I don't see anything come up in the host logs when this happens.

Let me know if there is other debugging info that would be helpful.

 - R.

* Re: [kvm-devel] guest crash on 2.6.20-rc4
  2007-01-08 16:18 guest crash on 2.6.20-rc4 Roland Dreier
@ 2007-01-09  8:53 ` Avi Kivity
  2007-01-09 21:26 ` Avi Kivity
  1 sibling, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2007-01-09  8:53 UTC
  To: Roland Dreier; +Cc: kvm-devel, linux-kernel

Roland Dreier wrote:
> I'm running a 64-bit Fedora 6 install as a guest on a host running
> 2.6.20-rc4 with the kvm-10 userspace release.  The CPU is a Xeon 5160
> and I have 6 GB of RAM.  The guest is given 512 MB of memory.  I left
> the guest idle overnight, and the makewhatis cron job seems to have
> triggered this:
>
>     Unable to handle kernel paging request at ffff81000ba04000 RIP:
>      [<ffffffff8025f402>] clear_page+0x16/0x44
>     PGD 8063 PUD 9063 PMD 800000000ba001e3 PTE aad8a7d881d984d9
>   

The pgd/pud/pmd entries are all correct, so the guest's page tables are
not the problem; it's clear the kvm mmu is confused.
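
For readers who want to check the entries by hand, here is a minimal
sketch that decodes the values from the oops.  The flag bits are the
architectural x86-64 ones; ASSUMED_PAGE_OFFSET (the base of the guest
kernel's direct mapping) and all function names are assumptions made for
illustration, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural x86-64 page-table entry bits. */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)
#define PTE_USER     (1ULL << 2)
#define PTE_ACCESSED (1ULL << 5)
#define PTE_DIRTY    (1ULL << 6)
#define PTE_PSE      (1ULL << 7)   /* in a PMD entry: maps a 2 MB page */
#define PTE_GLOBAL   (1ULL << 8)
#define PTE_NX       (1ULL << 63)

/* Assumption: start of the guest kernel's direct mapping of RAM. */
#define ASSUMED_PAGE_OFFSET 0xffff810000000000ULL

/* Physical base of the 2 MB page mapped by a large PMD entry (bits 51..21). */
uint64_t pmd_large_base(uint64_t pmd)
{
    assert(pmd & PTE_PSE);  /* only meaningful for large-page entries */
    return pmd & 0x000fffffffe00000ULL;
}

/* True if the direct-mapped address va lies inside the page mapped by pmd. */
bool pmd_covers(uint64_t pmd, uint64_t va)
{
    uint64_t pa = va - ASSUMED_PAGE_OFFSET;
    uint64_t base = pmd_large_base(pmd);
    return pa >= base && pa < base + (2ULL << 20);
}

/* Print one entry with its flags spelled out. */
void describe_entry(const char *name, uint64_t e)
{
    printf("%s %016llx:%s%s%s%s%s%s%s%s\n", name, (unsigned long long)e,
           (e & PTE_PRESENT)  ? " present"  : "",
           (e & PTE_WRITABLE) ? " writable" : "",
           (e & PTE_USER)     ? " user"     : "",
           (e & PTE_ACCESSED) ? " accessed" : "",
           (e & PTE_DIRTY)    ? " dirty"    : "",
           (e & PTE_PSE)      ? " large"    : "",
           (e & PTE_GLOBAL)   ? " global"   : "",
           (e & PTE_NX)       ? " nx"       : "");
}
```

Calling describe_entry() on 0x8063, 0x9063 and 0x800000000ba001e3 shows
every level present and writable with the user bit clear.  The PMD entry
has the large-page bit set, so under the assumed direct-map base it
covers the faulting address itself, and the value printed as "PTE" is
presumably just data read from inside that 2 MB page rather than a real
page-table entry.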

> I just let yum update the guest to the 2.6.18-1.2869.fc6 kernel, but
> I'm more suspicious of the MMU changes to kvm...
>
>   

Yes.

> I don't see anything come up in the host logs when this happens.
>
> Let me know if there is other debugging info that would be helpful.
>   

A way to reproduce this would be nice, though I realize that's asking a lot.

-- 
error compiling committee.c: too many arguments to function


* Re: [kvm-devel] guest crash on 2.6.20-rc4
  2007-01-08 16:18 guest crash on 2.6.20-rc4 Roland Dreier
  2007-01-09  8:53 ` [kvm-devel] " Avi Kivity
@ 2007-01-09 21:26 ` Avi Kivity
  2007-01-10 18:31   ` Roland Dreier
  2007-01-10 21:33   ` Roland Dreier
  1 sibling, 2 replies; 6+ messages in thread
From: Avi Kivity @ 2007-01-09 21:26 UTC
  To: Roland Dreier; +Cc: kvm-devel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 689 bytes --]

Roland Dreier wrote:
> I'm running a 64-bit Fedora 6 install as a guest on a host running
> 2.6.20-rc4 with the kvm-10 userspace release.  The CPU is a Xeon 5160
> and I have 6 GB of RAM.  The guest is given 512 MB of memory.  I left
> the guest idle overnight, and the makewhatis cron job seems to have
> triggered this:
>
>     Unable to handle kernel paging request at ffff81000ba04000 RIP:
>      [<ffffffff8025f402>] clear_page+0x16/0x44
>   

I've managed to reproduce a bug with similar characteristics: a write 
fault into a present, writable kernel page.  The attached patch should 
fix it.
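
The "write fault into a present, writable kernel page" reading can be
checked against the "Oops: 0003" line, which prints the page-fault error
code in hex.  A minimal sketch of the decoding follows; the bit layout
is the architectural x86 one, while the struct and function names are
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits. */
#define PF_PROT  (1u << 0)  /* 1: fault on a present page, 0: not present */
#define PF_WRITE (1u << 1)  /* 1: write access, 0: read access */
#define PF_USER  (1u << 2)  /* 1: fault raised in user mode */
#define PF_RSVD  (1u << 3)  /* 1: reserved bit set in a paging entry */
#define PF_INSTR (1u << 4)  /* 1: instruction fetch */

/* Hypothetical helper type: the error code split into named fields. */
struct pf_info {
    bool present;
    bool write;
    bool user;
    bool rsvd;
    bool ifetch;
};

struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info f = {
        .present = (code & PF_PROT)  != 0,
        .write   = (code & PF_WRITE) != 0,
        .user    = (code & PF_USER)  != 0,
        .rsvd    = (code & PF_RSVD)  != 0,
        .ifetch  = (code & PF_INSTR) != 0,
    };
    return f;
}

/* Usage: the error code from the oops in this thread. */
void example_oops_0003(void)
{
    struct pf_info f = decode_pf_error(0x0003);

    /* Present page, write access, raised in kernel mode. */
    assert(f.present && f.write && !f.user);
}
```

A protection fault on a write, in kernel mode, against entries that are
all marked writable is what makes the fault look spurious from the
guest's point of view.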

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


[-- Attachment #2: spurious-page-fault.patch --]
[-- Type: text/x-patch, Size: 386 bytes --]

Index: b/drivers/kvm/paging_tmpl.h
===================================================================
--- a/drivers/kvm/paging_tmpl.h	(revision 4270)
+++ b/drivers/kvm/paging_tmpl.h	(working copy)
@@ -274,7 +274,7 @@
 	struct kvm_mmu_page *page;
 
 	if (is_writeble_pte(*shadow_ent))
-		return 0;
+		return 1;
 
 	writable_shadow = *shadow_ent & PT_SHADOW_WRITABLE_MASK;
 	if (user) {

* Re: [kvm-devel] guest crash on 2.6.20-rc4
  2007-01-09 21:26 ` Avi Kivity
@ 2007-01-10 18:31   ` Roland Dreier
  2007-01-10 21:33   ` Roland Dreier
  1 sibling, 0 replies; 6+ messages in thread
From: Roland Dreier @ 2007-01-10 18:31 UTC
  To: Avi Kivity; +Cc: kvm-devel, linux-kernel

 > I've managed to reproduce a bug with similar characteristics: a write
 > fault into a present, writable kernel page.  The attached patch should
 > fix it.

Sorry for the delay in continuing this thread.  Anyway, the oops seems
to be pretty reproducible by running the makewhatis and locate db
update scripts in a loop.  I've applied your patch and kicked off a
test run;  I'll let you know if I can still get the bug to happen.

Thanks

* Re: [kvm-devel] guest crash on 2.6.20-rc4
  2007-01-09 21:26 ` Avi Kivity
  2007-01-10 18:31   ` Roland Dreier
@ 2007-01-10 21:33   ` Roland Dreier
  2007-01-11  8:06     ` Avi Kivity
  1 sibling, 1 reply; 6+ messages in thread
From: Roland Dreier @ 2007-01-10 21:33 UTC
  To: Avi Kivity; +Cc: kvm-devel, linux-kernel

 >  	if (is_writeble_pte(*shadow_ent))
 > -		return 0;
 > +		return 1;

With this patch, it looks like my guest is surviving the load that
triggered the oops before.  So I think this fixes the issue I saw as well.
I assume you'll send this in for 2.6.20?

 - R.

* Re: [kvm-devel] guest crash on 2.6.20-rc4
  2007-01-10 21:33   ` Roland Dreier
@ 2007-01-11  8:06     ` Avi Kivity
  0 siblings, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2007-01-11  8:06 UTC
  To: Roland Dreier; +Cc: kvm-devel, linux-kernel

Roland Dreier wrote:
>  >  	if (is_writeble_pte(*shadow_ent))
>  > -		return 0;
>  > +		return 1;
>
> With this patch, it looks like my guest is surviving the load that
> triggered the oops before.  So I think this fixes the issue I saw as well.
> I assume you'll send this in for 2.6.20?
>   

The patch actually replaces one bug (guest pagefaults on writable dirty 
ptes, under certain conditions) with another, rarer one (spinning on a 
user-mode pagefault on writable dirty kernel ptes).  I'll do it right 
and re-test, then send for .20 along with a few friends.
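
The trade-off can be made concrete with a toy model.  Everything in it
is hypothetical: the names, the two-valued result, and the reading of
the patched return value (0 = reflect the fault to the guest, 1 = treat
it as handled and let the guest retry) are assumptions inferred from
this thread, not the kvm code.  policy_refined() is only one plausible
way of "doing it right", not necessarily the fix that was sent.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed meaning of the return value changed by the patch. */
enum pf_action {
    PF_INJECT_TO_GUEST = 0,  /* guest sees a page fault */
    PF_RETRY           = 1   /* guest re-executes the faulting access */
};

/*
 * All three policies cover only the case the patched branch handles:
 * a write fault that finds the shadow pte already writable.
 *   user_fault  - the fault was raised by guest user-mode code
 *   pte_user_ok - the shadow pte allows user-mode access
 */

/* Before the patch: always reflect the fault to the guest. */
enum pf_action policy_before(bool user_fault, bool pte_user_ok)
{
    (void)user_fault;
    (void)pte_user_ok;
    return PF_INJECT_TO_GUEST;
}

/* With the patch from this thread: always retry. */
enum pf_action policy_patched(bool user_fault, bool pte_user_ok)
{
    (void)user_fault;
    (void)pte_user_ok;
    return PF_RETRY;
}

/* One possible refinement (an assumption): retry only if access is legal. */
enum pf_action policy_refined(bool user_fault, bool pte_user_ok)
{
    return (!user_fault || pte_user_ok) ? PF_RETRY : PF_INJECT_TO_GUEST;
}

/* Usage: the two scenarios described in the message above. */
void example_scenarios(void)
{
    /* Kernel write to a writable kernel pte: injecting is the spurious oops. */
    assert(policy_before(false, false) == PF_INJECT_TO_GUEST);
    assert(policy_patched(false, false) == PF_RETRY);

    /* User access to a kernel pte: retrying can never succeed (spin). */
    assert(policy_patched(true, false) == PF_RETRY);
    assert(policy_refined(true, false) == PF_INJECT_TO_GUEST);
}
```

In this model the original code is wrong for kernel-mode faults, the
one-line patch is wrong for user-mode faults on kernel-only ptes, and
only a check of both the fault mode and the pte's user bit handles both.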


-- 
error compiling committee.c: too many arguments to function

