LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Cc: nfsv4@linux-nfs.org, linux-kernel@vger.kernel.org,
	Neil Brown <neilb@suse.de>
Subject: Re: Oops in NFSv4 server in 2.6.23.17
Date: Wed, 12 Mar 2008 12:00:07 -0400	[thread overview]
Message-ID: <20080312160007.GA10015@fieldses.org> (raw)
In-Reply-To: <20080312122550.GB8141@ics.muni.cz>

On Wed, Mar 12, 2008 at 01:25:50PM +0100, Lukas Hejtmanek wrote:
> Hello,
> 
> I'm running NFSv4 server (nfs-utils 1.1.1) with vanila kernel 2.6.23.17.
> It works while I use the server from standard ethernet lines.
> 
> However, I tried to mount the volume over GPRS network. For the first time, it
> replies premission denied and the second try results in the following oops.
> 
> Anything I should try?
> 
> I guess that the last oops is induced from the first one due to some memory 
> corruption.
> 
> WARNING: at lib/kref.c:33 kref_get()

Hm.  So looks like that means kref_get() was called on something with a
zero reference count.

> 
> Call Trace:
>  [<ffffffff803a1315>] kref_get+0x35/0x40
>  [<ffffffff88128f34>] :sunrpc:sunrpc_cache_lookup+0x64/0x160

So sunrpc_cache_lookup found something with a zero reference count,
in...

>  [<ffffffff881b5910>] :nfsd:svc_export_lookup+0xc0/0xd0

... the has table that keeps the kernel's cached view of the export
table.

>  [<ffffffff881b594f>] :nfsd:exp_get_by_name+0x2f/0x80
>  [<ffffffff88128f34>] :sunrpc:sunrpc_cache_lookup+0x64/0x160
>  [<ffffffff88127b89>] :sunrpc:cache_check+0x49/0x490
>  [<ffffffff881b5fb7>] :nfsd:exp_find_key+0x57/0xc0
>  [<ffffffff881b6064>] :nfsd:exp_find+0x44/0xa0
>  [<ffffffff881b615e>] :nfsd:rqst_exp_find+0x9e/0xf0
>  [<ffffffff881b0c45>] :nfsd:fh_verify+0x425/0x540
>  [<ffffffff881bf25e>] :nfsd:nfsd4_proc_compound+0x1de/0x330
>  [<ffffffff881ad271>] :nfsd:nfsd_dispatch+0xb1/0x250
>  [<ffffffff8811fe91>] :sunrpc:svc_process+0x491/0x7d0
>  [<ffffffff804e8032>] __down_read+0x12/0xb0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad9b0>] :nfsd:nfsd+0x1a0/0x2f0
>  [<ffffffff8020d068>] child_rip+0xa/0x12
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff8020d05e>] child_rip+0x0/0x12
> 
> WARNING: at lib/kref.c:33 kref_get()
> 
> Call Trace:
>  [<ffffffff803a1315>] kref_get+0x35/0x40
>  [<ffffffff88128f34>] :sunrpc:sunrpc_cache_lookup+0x64/0x160
>  [<ffffffff881b5910>] :nfsd:svc_export_lookup+0xc0/0xd0
>  [<ffffffff881b594f>] :nfsd:exp_get_by_name+0x2f/0x80
>  [<ffffffff88128f34>] :sunrpc:sunrpc_cache_lookup+0x64/0x160
>  [<ffffffff88127b89>] :sunrpc:cache_check+0x49/0x490
>  [<ffffffff881b5fb7>] :nfsd:exp_find_key+0x57/0xc0
>  [<ffffffff881b6064>] :nfsd:exp_find+0x44/0xa0
>  [<ffffffff881b615e>] :nfsd:rqst_exp_find+0x9e/0xf0
>  [<ffffffff881b0c45>] :nfsd:fh_verify+0x425/0x540
>  [<ffffffff881bf25e>] :nfsd:nfsd4_proc_compound+0x1de/0x330
>  [<ffffffff881ad271>] :nfsd:nfsd_dispatch+0xb1/0x250
>  [<ffffffff8811fe91>] :sunrpc:svc_process+0x491/0x7d0
>  [<ffffffff804e8032>] __down_read+0x12/0xb0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad9b0>] :nfsd:nfsd+0x1a0/0x2f0
>  [<ffffffff8020d068>] child_rip+0xa/0x12
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff8020d05e>] child_rip+0x0/0x12
> 
> WARNING: at lib/kref.c:33 kref_get()
> 
> Call Trace:
>  [<ffffffff803a1315>] kref_get+0x35/0x40
>  [<ffffffff88128f34>] :sunrpc:sunrpc_cache_lookup+0x64/0x160
>  [<ffffffff881b5910>] :nfsd:svc_export_lookup+0xc0/0xd0
>  [<ffffffff881b594f>] :nfsd:exp_get_by_name+0x2f/0x80
>  [<ffffffff88128f34>] :sunrpc:sunrpc_cache_lookup+0x64/0x160
>  [<ffffffff88127b89>] :sunrpc:cache_check+0x49/0x490
>  [<ffffffff881b5fb7>] :nfsd:exp_find_key+0x57/0xc0
>  [<ffffffff881b6064>] :nfsd:exp_find+0x44/0xa0
>  [<ffffffff881b615e>] :nfsd:rqst_exp_find+0x9e/0xf0
>  [<ffffffff881b0c45>] :nfsd:fh_verify+0x425/0x540
>  [<ffffffff881bf25e>] :nfsd:nfsd4_proc_compound+0x1de/0x330
>  [<ffffffff881ad271>] :nfsd:nfsd_dispatch+0xb1/0x250
>  [<ffffffff8811fe91>] :sunrpc:svc_process+0x491/0x7d0
>  [<ffffffff804e8032>] __down_read+0x12/0xb0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad9b0>] :nfsd:nfsd+0x1a0/0x2f0
>  [<ffffffff8020d068>] child_rip+0xa/0x12
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff8020d05e>] child_rip+0x0/0x12
> 
> general protection fault: 0000 [1] SMP 
> CPU 0 
> Modules linked in: des cbc blkcipher nfsd exportfs rpcsec_gss_krb5 auth_rpcgss nfs lockd nfs_acl sunrpc ipv6 dm_snapshot dm_mirror loop dm_round_robin dm_multipath dm_mod ata_piix ata_generic thermal mptfc libata processor scsi_transport_fc ehci_hcd e1000 evdev serio_raw psmouse uhci_hcd i2c_i801 i2c_core
> Pid: 2863, comm: nfsd Not tainted 2.6.23.17 #1
> RIP: 0010:[<ffffffff8819b113>]  [<ffffffff8819b113>] :auth_rpcgss:gss_get_mic+0x3/0x10
> RSP: 0018:ffff81025c1d1c78  EFLAGS: 00010286
> RAX: 762aba3d00000004 RBX: ffff81025b19f240 RCX: ffff81025c05c000
> RDX: ffff81025c1d1cd0 RSI: ffff81025c1d1c80 RDI: ffff81025effa600
> RBP: ffff81025c05c014 R08: 0000000000000000 R09: 0000000000000004
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff81025effa600
> R13: ffff81025b596000 R14: ffff81025c05c018 R15: ffff81025b596130
> FS:  00002b540c07b6d0(0000) GS:ffffffff805fe000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007fffa57e00e8 CR3: 000000025bb40000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process nfsd (pid: 2863, threadinfo ffff81025c1d0000, task ffff81025fe0c0c0)
> Stack:  ffffffff8819c175 ffff81025c1d1cec 0000000000000004 0000000000000000
>  0000000000000000 0000000000000000 0000000000000000 0000000400000004
>  0000000000000000 ffff81025c1d1cec 0000000000000004 ffff81025effa600
> Call Trace:
>  [<ffffffff8819c175>] :auth_rpcgss:gss_write_verf+0xa5/0x130

In this case it looks like something in the gss context cache is bad.

I'm not sure what's going on.

>From a quick

	gitk v2.6.23..origin fs/nfsd net/sunrpc/cache.c include/linux/sunrpc/cache.h

I don't see any recent commits that would address a problem like this.
But it might be worth trying to reproduce this in v2.6.25-rc5.

It'd also be interesting to know whether this is reproduceable with
auth_unix (as opposed to auth_gss) access, and/or with nfsv3 as opposed
to nfsv4.

--b.


>  [<ffffffff8819d408>] :auth_rpcgss:svcauth_gss_accept+0x8b8/0xc30
>  [<ffffffff88120c7f>] :sunrpc:svc_sock_enqueue+0x19f/0x340
>  [<ffffffff88122531>] :sunrpc:svc_tcp_recvfrom+0x81/0x980
>  [<ffffffff80242250>] del_timer_sync+0x10/0x20
>  [<ffffffff804e7057>] schedule_timeout+0x67/0xd0
>  [<ffffffff80242060>] process_timeout+0x0/0x10
>  [<ffffffff804e704a>] schedule_timeout+0x5a/0xd0
>  [<ffffffff8811fd78>] :sunrpc:svc_process+0x378/0x7d0
>  [<ffffffff80231d60>] default_wake_function+0x0/0x10
>  [<ffffffff804e8032>] __down_read+0x12/0xb0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad9b0>] :nfsd:nfsd+0x1a0/0x2f0
>  [<ffffffff8020d068>] child_rip+0xa/0x12
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff881ad810>] :nfsd:nfsd+0x0/0x2f0
>  [<ffffffff8020d05e>] child_rip+0x0/0x12
> 
> 
> Code: 48 8b 40 30 4c 8b 58 08 41 ff e3 66 90 48 8b 07 48 8b 40 30 
> RIP  [<ffffffff8819b113>] :auth_rpcgss:gss_get_mic+0x3/0x10
>  RSP <ffff81025c1d1c78>
> 
> 
> kernel: invalid opcode: 0000 [2] SMP 
> CPU 1 
> Modules linked in: des cbc blkcipher nfsd exportfs rpcsec_gss_krb5 auth_rpcgss nfs lockd nfs_acl sunrpc ipv6 dm_snapshot dm_mirror loop dm_round_robin dm_multipath dm_mod ata_piix serio_raw mptfc psmouse scsi_transport_fc evdev ata_generic libata i2c_i801 e1000 i2c_core ehci_hcd uhci_hcd thermal processor
> Pid: 2748, comm: multipathd Tainted: G      D 2.6.23.17 #1
> RIP: 0010:[<ffffffff80291e9d>]  [<ffffffff80291e9d>] cache_alloc_refill+0x1ad/0x540
> RSP: 0018:ffff810259aafe68  EFLAGS: 00010046
> RAX: ffff81025e494040 RBX: 0000000000000020 RCX: 000000000000001c
> RDX: ffff81025e494040 RSI: 000000000000000f RDI: ffff81025e810040
> RBP: ffff81025fc6ec00 R08: ffff81025e810070 R09: 0000000000000000
> R10: 0000000000159849 R11: 0000000000000000 R12: ffff81025fc00680
> R13: ffff81025fc023c0 R14: 0000000000000000 R15: ffff81025fc0d000
> FS:  0000000040061960(0063) GS:ffff81025fc6ce40(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00002ba20d8f4e30 CR3: 0000000259b10000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process multipathd (pid: 2748, threadinfo ffff810259aae000, task ffff81025fd06040)
> Stack:  000000d000000010 ffffffff80291d6a 000000d0000080d0 ffff810259b19000
>  0000000000000296 00000000000080d0 ffff81025fc00680 00007fff9d909298
>  0000000040060e10 ffffffff80291cd2 0000000040062000 ffff810259b19000
> Call Trace:
>  [<ffffffff80291d6a>] cache_alloc_refill+0x7a/0x540
>  [<ffffffff80291cd2>] kmem_cache_alloc+0x82/0xa0
>  [<ffffffff8029bcf6>] do_execve+0x46/0x230
>  [<ffffffff8020ae44>] sys_execve+0x44/0xb0
>  [<ffffffff8020c617>] stub_execve+0x67/0xb0
> 
> 
> Code: 0f 0b eb fe 49 8b 45 10 49 8d 55 10 48 89 78 08 48 89 07 48 
>  RSP <ffff810259aafe68>
> 
> 
> -- 
> Lukáš Hejtmánek
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

  reply	other threads:[~2008-03-12 16:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-12 12:25 Lukas Hejtmanek
2008-03-12 16:00 ` J. Bruce Fields [this message]
2008-03-13 14:36   ` Lukas Hejtmanek
2008-03-14 18:14     ` J. Bruce Fields
2008-03-14 19:33       ` J. Bruce Fields
2008-03-14 19:53         ` Lukas Hejtmanek
2008-03-14 20:05           ` J. Bruce Fields
2008-03-14 23:37             ` [PATCH] nfsd: fix oops on access from high-numbered ports J. Bruce Fields
2008-03-14 23:42             ` Oops in NFSv4 server in 2.6.23.17 J. Bruce Fields
2008-03-16 23:19               ` Neil Brown
2008-03-19 22:32                 ` J. Bruce Fields
2008-03-20  2:31                   ` NeilBrown
2008-03-17  8:12             ` Lukas Hejtmanek
2008-03-17 13:15               ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080312160007.GA10015@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=nfsv4@linux-nfs.org \
    --cc=xhejtman@ics.muni.cz \
    --subject='Re: Oops in NFSv4 server in 2.6.23.17' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).