LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
@ 2006-08-08 14:39 Jesper Juhl
2006-08-09 5:53 ` Grant Coady
2006-08-17 6:49 ` [NFS] " Neil Brown
0 siblings, 2 replies; 17+ messages in thread
From: Jesper Juhl @ 2006-08-08 14:39 UTC (permalink / raw)
To: Linux Kernel Mailing List; +Cc: Trond Myklebust, nfs
I have some webservers that have recently started reporting the
following message in their logs :
do_vfs_lock: VFS is out of sync with lock manager!
The serveres kernels were upgraded to 2.6.17.8 and since the upgrade
the message started appearing.
The servers were previously running 2.6.13.4 without experiencing this problem.
Nothing has changed except the kernel.
I've googled a bit and found this mail
(http://lkml.org/lkml/2005/8/23/254) from Trond saying that
"The above is a lockd error that states that the VFS is failing to track
your NFS locks correctly".
Ok, but that doesn't really help me resolve the issue. The servers are
indeed running NFS and access their apache DocumentRoots from a NFS
mount.
Is there anything I can do to help track down this issue?
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-08 14:39 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager! Jesper Juhl
@ 2006-08-09 5:53 ` Grant Coady
2006-08-09 8:07 ` Jesper Juhl
2006-08-17 6:49 ` [NFS] " Neil Brown
1 sibling, 1 reply; 17+ messages in thread
From: Grant Coady @ 2006-08-09 5:53 UTC (permalink / raw)
To: Jesper Juhl; +Cc: Linux Kernel Mailing List, Trond Myklebust, nfs
On Tue, 8 Aug 2006 16:39:54 +0200, "Jesper Juhl" <jesper.juhl@gmail.com> wrote:
>I have some webservers that have recently started reporting the
>following message in their logs :
>
> do_vfs_lock: VFS is out of sync with lock manager!
>
>The serveres kernels were upgraded to 2.6.17.8 and since the upgrade
>the message started appearing.
>The servers were previously running 2.6.13.4 without experiencing this problem.
>Nothing has changed except the kernel.
>
>I've googled a bit and found this mail
>(http://lkml.org/lkml/2005/8/23/254) from Trond saying that
>"The above is a lockd error that states that the VFS is failing to track
>your NFS locks correctly".
>Ok, but that doesn't really help me resolve the issue. The servers are
>indeed running NFS and access their apache DocumentRoots from a NFS
>mount.
>
>Is there anything I can do to help track down this issue?
I don't have an answer, but offer this observation: five boxen running
2.6.17.8 doing six simultaneous
bzcat /home/share/linux-2.6/patch-2.6.18-rc4.bz2|patch -p1
didn't burp. The /home/share/ is an NFS export from another box running
2.4.33-rc3a, me not sure if this was exercising any NFS locking as the
NFS source file was only opened for non-exclusive read-only.
Grant.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-09 5:53 ` Grant Coady
@ 2006-08-09 8:07 ` Jesper Juhl
2006-08-10 22:37 ` Jesper Juhl
0 siblings, 1 reply; 17+ messages in thread
From: Jesper Juhl @ 2006-08-09 8:07 UTC (permalink / raw)
To: Grant Coady; +Cc: Linux Kernel Mailing List, Trond Myklebust, nfs
On 09/08/06, Grant Coady <gcoady.lk@gmail.com> wrote:
> On Tue, 8 Aug 2006 16:39:54 +0200, "Jesper Juhl" <jesper.juhl@gmail.com> wrote:
>
> >I have some webservers that have recently started reporting the
> >following message in their logs :
> >
> > do_vfs_lock: VFS is out of sync with lock manager!
> >
> >The serveres kernels were upgraded to 2.6.17.8 and since the upgrade
> >the message started appearing.
> >The servers were previously running 2.6.13.4 without experiencing this problem.
> >Nothing has changed except the kernel.
> >
> >I've googled a bit and found this mail
> >(http://lkml.org/lkml/2005/8/23/254) from Trond saying that
> >"The above is a lockd error that states that the VFS is failing to track
> >your NFS locks correctly".
> >Ok, but that doesn't really help me resolve the issue. The servers are
> >indeed running NFS and access their apache DocumentRoots from a NFS
> >mount.
> >
> >Is there anything I can do to help track down this issue?
>
> I don't have an answer, but offer this observation: five boxen running
> 2.6.17.8 doing six simultaneous
>
> bzcat /home/share/linux-2.6/patch-2.6.18-rc4.bz2|patch -p1
>
> didn't burp. The /home/share/ is an NFS export from another box running
> 2.4.33-rc3a, me not sure if this was exercising any NFS locking as the
> NFS source file was only opened for non-exclusive read-only.
>
The NFS server here is running 2.6.11.11 and doesn't seem to be
reporting any problems. But I now have two more of my webservers (both
running 2.6.17.8) that have started to complain about "do_vfs_lock:
VFS is out of sync with lock manager!"
I've not found a way to cause the message to be repported at will unfortunately.
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-09 8:07 ` Jesper Juhl
@ 2006-08-10 22:37 ` Jesper Juhl
2006-08-11 0:30 ` Grant Coady
2006-08-13 23:08 ` Grant Coady
0 siblings, 2 replies; 17+ messages in thread
From: Jesper Juhl @ 2006-08-10 22:37 UTC (permalink / raw)
To: Grant Coady; +Cc: Linux Kernel Mailing List, Trond Myklebust, nfs
On 09/08/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
> On 09/08/06, Grant Coady <gcoady.lk@gmail.com> wrote:
> > On Tue, 8 Aug 2006 16:39:54 +0200, "Jesper Juhl" <jesper.juhl@gmail.com> wrote:
> >
> > >I have some webservers that have recently started reporting the
> > >following message in their logs :
> > >
> > > do_vfs_lock: VFS is out of sync with lock manager!
> > >
> > >The serveres kernels were upgraded to 2.6.17.8 and since the upgrade
> > >the message started appearing.
> > >The servers were previously running 2.6.13.4 without experiencing this problem.
> > >Nothing has changed except the kernel.
> > >
> > >I've googled a bit and found this mail
> > >(http://lkml.org/lkml/2005/8/23/254) from Trond saying that
> > >"The above is a lockd error that states that the VFS is failing to track
> > >your NFS locks correctly".
> > >Ok, but that doesn't really help me resolve the issue. The servers are
> > >indeed running NFS and access their apache DocumentRoots from a NFS
> > >mount.
> > >
> > >Is there anything I can do to help track down this issue?
> >
> > I don't have an answer, but offer this observation: five boxen running
> > 2.6.17.8 doing six simultaneous
> >
> > bzcat /home/share/linux-2.6/patch-2.6.18-rc4.bz2|patch -p1
> >
> > didn't burp. The /home/share/ is an NFS export from another box running
> > 2.4.33-rc3a, me not sure if this was exercising any NFS locking as the
> > NFS source file was only opened for non-exclusive read-only.
> >
> The NFS server here is running 2.6.11.11 and doesn't seem to be
> reporting any problems. But I now have two more of my webservers (both
> running 2.6.17.8) that have started to complain about "do_vfs_lock:
> VFS is out of sync with lock manager!"
>
> I've not found a way to cause the message to be repported at will unfortunately.
>
Today 3 more of my webservers running 2.6.17.8 reported this message.
The machines all seem to be running fine still, so it doesn't seem to
be a serious problem, but it would still be nice to get it fixed ;)
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-10 22:37 ` Jesper Juhl
@ 2006-08-11 0:30 ` Grant Coady
2006-08-13 23:08 ` Grant Coady
1 sibling, 0 replies; 17+ messages in thread
From: Grant Coady @ 2006-08-11 0:30 UTC (permalink / raw)
To: Jesper Juhl; +Cc: Linux Kernel Mailing List, Trond Myklebust, nfs
On Fri, 11 Aug 2006 00:37:35 +0200, "Jesper Juhl" <jesper.juhl@gmail.com> wrote:
>On 09/08/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
>> On 09/08/06, Grant Coady <gcoady.lk@gmail.com> wrote:
>> > On Tue, 8 Aug 2006 16:39:54 +0200, "Jesper Juhl" <jesper.juhl@gmail.com> wrote:
>> >
>> > >I have some webservers that have recently started reporting the
>> > >following message in their logs :
>> > >
>> > > do_vfs_lock: VFS is out of sync with lock manager!
>> > >
>> > >The serveres kernels were upgraded to 2.6.17.8 and since the upgrade
>> > >the message started appearing.
>> > >The servers were previously running 2.6.13.4 without experiencing this problem.
>> > >Nothing has changed except the kernel.
>> > >
>> > >I've googled a bit and found this mail
>> > >(http://lkml.org/lkml/2005/8/23/254) from Trond saying that
>> > >"The above is a lockd error that states that the VFS is failing to track
>> > >your NFS locks correctly".
>> > >Ok, but that doesn't really help me resolve the issue. The servers are
>> > >indeed running NFS and access their apache DocumentRoots from a NFS
>> > >mount.
>> > >
>> > >Is there anything I can do to help track down this issue?
>> >
>> > I don't have an answer, but offer this observation: five boxen running
>> > 2.6.17.8 doing six simultaneous
>> >
>> > bzcat /home/share/linux-2.6/patch-2.6.18-rc4.bz2|patch -p1
>> >
>> > didn't burp. The /home/share/ is an NFS export from another box running
>> > 2.4.33-rc3a, me not sure if this was exercising any NFS locking as the
>> > NFS source file was only opened for non-exclusive read-only.
>> >
>> The NFS server here is running 2.6.11.11 and doesn't seem to be
>> reporting any problems. But I now have two more of my webservers (both
>> running 2.6.17.8) that have started to complain about "do_vfs_lock:
>> VFS is out of sync with lock manager!"
>>
>> I've not found a way to cause the message to be repported at will unfortunately.
>>
>Today 3 more of my webservers running 2.6.17.8 reported this message.
>The machines all seem to be running fine still, so it doesn't seem to
>be a serious problem, but it would still be nice to get it fixed ;)
Do you have some test case other than web server that triggers it?
I can try it here, recently did much NFS testing reported under
2.4.33-rc1 or -rc2 on lkml to try sort a vfs_unlink issue.
My web server here is low volume, not a good test situation -- plus
it no longer runs under 2.6 and I don't want to take it down too often
to sort out why. Once it could dual boot 2.4 or 2.6 without trouble,
but that option fell apart when 2.6.16 came out.
Grant.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-10 22:37 ` Jesper Juhl
2006-08-11 0:30 ` Grant Coady
@ 2006-08-13 23:08 ` Grant Coady
1 sibling, 0 replies; 17+ messages in thread
From: Grant Coady @ 2006-08-13 23:08 UTC (permalink / raw)
To: Jesper Juhl; +Cc: Linux Kernel Mailing List, Trond Myklebust, nfs
On Fri, 11 Aug 2006 00:37:35 +0200, "Jesper Juhl" <jesper.juhl@gmail.com> wrote:
>On 09/08/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
>> On 09/08/06, Grant Coady <gcoady.lk@gmail.com> wrote:
>> > On Tue, 8 Aug 2006 16:39:54 +0200, "Jesper Juhl" <jesper.juhl@gmail.com> wrote:
>> >
>> > >I have some webservers that have recently started reporting the
>> > >following message in their logs :
>> > >
>> > > do_vfs_lock: VFS is out of sync with lock manager!
>> > >
>> > >The serveres kernels were upgraded to 2.6.17.8 and since the upgrade
>> > >the message started appearing.
>> > >The servers were previously running 2.6.13.4 without experiencing this problem.
>> > >Nothing has changed except the kernel.
>> > >
>> > >I've googled a bit and found this mail
>> > >(http://lkml.org/lkml/2005/8/23/254) from Trond saying that
>> > >"The above is a lockd error that states that the VFS is failing to track
>> > >your NFS locks correctly".
>> > >Ok, but that doesn't really help me resolve the issue. The servers are
>> > >indeed running NFS and access their apache DocumentRoots from a NFS
>> > >mount.
>> > >
>> > >Is there anything I can do to help track down this issue?
>> >
>> > I don't have an answer, but offer this observation: five boxen running
>> > 2.6.17.8 doing six simultaneous
>> >
>> > bzcat /home/share/linux-2.6/patch-2.6.18-rc4.bz2|patch -p1
>> >
>> > didn't burp. The /home/share/ is an NFS export from another box running
>> > 2.4.33-rc3a, me not sure if this was exercising any NFS locking as the
>> > NFS source file was only opened for non-exclusive read-only.
>> >
>> The NFS server here is running 2.6.11.11 and doesn't seem to be
>> reporting any problems. But I now have two more of my webservers (both
>> running 2.6.17.8) that have started to complain about "do_vfs_lock:
>> VFS is out of sync with lock manager!"
>>
>> I've not found a way to cause the message to be repported at will unfortunately.
>>
>Today 3 more of my webservers running 2.6.17.8 reported this message.
>The machines all seem to be running fine still, so it doesn't seem to
>be a serious problem, but it would still be nice to get it fixed ;)
I'm running continuous kernel 2.4.33 rebuild from make mrproper plus
another console extracting tarball, diff tree against last_extracted,
on pair of 2.6.17.8 boxen overnight with NFS TCP support, no problems,
now testing without TCP support. Report again only if I see problems.
Let me know if you want to see test scripts.
Grant.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-08 14:39 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager! Jesper Juhl
2006-08-09 5:53 ` Grant Coady
@ 2006-08-17 6:49 ` Neil Brown
2006-08-17 9:58 ` Jesper Juhl
1 sibling, 1 reply; 17+ messages in thread
From: Neil Brown @ 2006-08-17 6:49 UTC (permalink / raw)
To: Jesper Juhl; +Cc: Linux Kernel Mailing List, nfs, Trond Myklebust
On Tuesday August 8, jesper.juhl@gmail.com wrote:
> I have some webservers that have recently started reporting the
> following message in their logs :
>
> do_vfs_lock: VFS is out of sync with lock manager!
I can imagine that happening if you mount with '-o nolocks'.
Then a non-blocking lock could cause that message (I think).
Can you conform that you aren't using 'nolocks'.
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-17 6:49 ` [NFS] " Neil Brown
@ 2006-08-17 9:58 ` Jesper Juhl
2006-08-21 3:34 ` Neil Brown
0 siblings, 1 reply; 17+ messages in thread
From: Jesper Juhl @ 2006-08-17 9:58 UTC (permalink / raw)
To: Neil Brown; +Cc: Linux Kernel Mailing List, nfs, Trond Myklebust
On 17/08/06, Neil Brown <neilb@suse.de> wrote:
> On Tuesday August 8, jesper.juhl@gmail.com wrote:
> > I have some webservers that have recently started reporting the
> > following message in their logs :
> >
> > do_vfs_lock: VFS is out of sync with lock manager!
>
>
> I can imagine that happening if you mount with '-o nolocks'.
> Then a non-blocking lock could cause that message (I think).
> Can you conform that you aren't using 'nolocks'.
>
Confirmed.
All my webservers (the ones that generate this message) are identical
and this is the filesystems they have and their mount options:
/ is on a local scsi disk, ext3 fs, mounted with (rw)
/boot is on a local scsi disk, ext3 fs, mounted with (rw)
users homedirs (where the DocumentRoots are) are NFS mounted with
mount options (rw,rsize=8192,wsize=8192,hard,intr,addr=xxx.xxx.xxx.xxx)
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-17 9:58 ` Jesper Juhl
@ 2006-08-21 3:34 ` Neil Brown
2006-08-21 19:54 ` Trond Myklebust
0 siblings, 1 reply; 17+ messages in thread
From: Neil Brown @ 2006-08-21 3:34 UTC (permalink / raw)
To: Jesper Juhl; +Cc: nfs, Linux Kernel Mailing List, Trond Myklebust
On Thursday August 17, jesper.juhl@gmail.com wrote:
> On 17/08/06, Neil Brown <neilb@suse.de> wrote:
> > On Tuesday August 8, jesper.juhl@gmail.com wrote:
> > > I have some webservers that have recently started reporting the
> > > following message in their logs :
> > >
> > > do_vfs_lock: VFS is out of sync with lock manager!
> >
> >
> > I can imagine that happening if you mount with '-o nolocks'.
> > Then a non-blocking lock could cause that message (I think).
> > Can you conform that you aren't using 'nolocks'.
> >
> Confirmed.
Thanks. I suspected as much but don't like to assume.
I've look more thoroughly at this code and I think the message is
meaningless and can be ignored.
Looking in fs/nfs/file.c (at 2.6.18-rc4-mm1 if it matters, but 2.6.17
is much the same)
- do_vfs_lock is only called when the filesystem was mounted with
-o nolock EXCEPT
- If a lock request to the server in interrupted (when mounted with
-o intr) then do_vfs_lock is called to try to get the lock
locally. Normally equivalent code will be called inside
fs/lockd/clntproc.c when the server replies that the lock has been
gained. In the case of an interrupt though this doesn't happen
but the lock may still have happened on the server. So we record
locally that the lock was gained, to ensure that it gets unlocked
when the process exits.
As you don't have '-o nolocks' you must be hitting the second case.
The lock call to the server returns -EINTR or -ERESTARTSYS and
do_vfs_lock is called just-in-case.
As this is a just-in-case call, it is quite possible that the lock is
held by some other process, so getting an error is entirely possible.
So printing the message in this case seems wrong.
On the other hand, printing the message in any other case seems wrong
too, as server locking is not being used, so there is nothing to get
out of sync with.
As a further complication, I don't think that in the just-in-case
situation that it should risk waiting for the lock.
Now maybe we can be sure there is a pending signal which will break
out of any wait (though I'm worried about -ERESTARTSYS - that doesn't
imply a signal does it?), but I would feel more comfortable if
FL_SLEEP were turned off in that path.
So: Trond: Any obvious errors in the above?
Is the following patch ok?
NeilBrown
Signed-off-by: Neil Brown <neilb@suse.de>
### Diffstat output
./fs/nfs/file.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff .prev/fs/nfs/file.c ./fs/nfs/file.c
--- .prev/fs/nfs/file.c 2006-08-21 13:28:25.000000000 +1000
+++ ./fs/nfs/file.c 2006-08-21 13:30:27.000000000 +1000
@@ -452,9 +452,6 @@ static int do_vfs_lock(struct file *file
default:
BUG();
}
- if (res < 0)
- printk(KERN_WARNING "%s: VFS is out of sync with lock manager!\n",
- __FUNCTION__);
return res;
}
@@ -504,10 +501,13 @@ static int do_setlk(struct file *filp, i
* we clean up any state on the server. We therefore
* record the lock call as having succeeded in order to
* ensure that locks_remove_posix() cleans it out when
- * the process exits.
+ * the process exits. Make sure not to sleep if
+ * someone else holds the lock.
*/
- if (status == -EINTR || status == -ERESTARTSYS)
+ if (status == -EINTR || status == -ERESTARTSYS) {
+ fl->fl_flags &= ~FL_SLEEP;
do_vfs_lock(filp, fl);
+ }
} else
status = do_vfs_lock(filp, fl);
unlock_kernel();
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-21 3:34 ` Neil Brown
@ 2006-08-21 19:54 ` Trond Myklebust
2006-11-21 12:43 ` Jesper Juhl
2007-01-29 5:08 ` Neil Brown
0 siblings, 2 replies; 17+ messages in thread
From: Trond Myklebust @ 2006-08-21 19:54 UTC (permalink / raw)
To: Neil Brown; +Cc: Jesper Juhl, nfs, Linux Kernel Mailing List
On Mon, 2006-08-21 at 13:34 +1000, Neil Brown wrote:
> Looking in fs/nfs/file.c (at 2.6.18-rc4-mm1 if it matters, but 2.6.17
> is much the same)
>
> - do_vfs_lock is only called when the filesystem was mounted with
> -o nolock EXCEPT
> - If a lock request to the server in interrupted (when mounted with
> -o intr) then do_vfs_lock is called to try to get the lock
> locally. Normally equivalent code will be called inside
> fs/lockd/clntproc.c when the server replies that the lock has been
> gained. In the case of an interrupt though this doesn't happen
> but the lock may still have happened on the server. So we record
> locally that the lock was gained, to ensure that it gets unlocked
> when the process exits.
>
> As you don't have '-o nolocks' you must be hitting the second case.
> The lock call to the server returns -EINTR or -ERESTARTSYS and
> do_vfs_lock is called just-in-case.
> As this is a just-in-case call, it is quite possible that the lock is
> held by some other process, so getting an error is entirely possible.
> So printing the message in this case seems wrong.
>
> On the other hand, printing the message in any other case seems wrong
> too, as server locking is not being used, so there is nothing to get
> out of sync with.
>
> As a further complication, I don't think that in the just-in-case
> situation that it should risk waiting for the lock.
> Now maybe we can be sure there is a pending signal which will break
> out of any wait (though I'm worried about -ERESTARTSYS - that doesn't
> imply a signal does it?), but I would feel more comfortable if
> FL_SLEEP were turned off in that path.
>
> So: Trond: Any obvious errors in the above?
> Is the following patch ok?
Could we instead replace it with a dprintk() that returns the value of
"res"? That will keep it useful for debugging purposes.
Cheers,
Trond
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-21 19:54 ` Trond Myklebust
@ 2006-11-21 12:43 ` Jesper Juhl
2006-11-27 9:19 ` Jesper Juhl
2007-01-29 5:08 ` Neil Brown
1 sibling, 1 reply; 17+ messages in thread
From: Jesper Juhl @ 2006-11-21 12:43 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Neil Brown, nfs, Linux Kernel Mailing List
[-- Attachment #1: Type: text/plain, Size: 3882 bytes --]
On 21/08/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Mon, 2006-08-21 at 13:34 +1000, Neil Brown wrote:
> > Looking in fs/nfs/file.c (at 2.6.18-rc4-mm1 if it matters, but 2.6.17
> > is much the same)
> >
> > - do_vfs_lock is only called when the filesystem was mounted with
> > -o nolock EXCEPT
> > - If a lock request to the server in interrupted (when mounted with
> > -o intr) then do_vfs_lock is called to try to get the lock
> > locally. Normally equivalent code will be called inside
> > fs/lockd/clntproc.c when the server replies that the lock has been
> > gained. In the case of an interrupt though this doesn't happen
> > but the lock may still have happened on the server. So we record
> > locally that the lock was gained, to ensure that it gets unlocked
> > when the process exits.
> >
> > As you don't have '-o nolocks' you must be hitting the second case.
> > The lock call to the server returns -EINTR or -ERESTARTSYS and
> > do_vfs_lock is called just-in-case.
> > As this is a just-in-case call, it is quite possible that the lock is
> > held by some other process, so getting an error is entirely possible.
> > So printing the message in this case seems wrong.
> >
> > On the other hand, printing the message in any other case seems wrong
> > too, as server locking is not being used, so there is nothing to get
> > out of sync with.
> >
> > As a further complication, I don't think that in the just-in-case
> > situation that it should risk waiting for the lock.
> > Now maybe we can be sure there is a pending signal which will break
> > out of any wait (though I'm worried about -ERESTARTSYS - that doesn't
> > imply a signal does it?), but I would feel more comfortable if
> > FL_SLEEP were turned off in that path.
> >
> > So: Trond: Any obvious errors in the above?
> > Is the following patch ok?
>
> Could we instead replace it with a dprintk() that returns the value of
> "res"? That will keep it useful for debugging purposes.
>
How about the below?
(compile tested only)
Neil: I left your Signed-off-by line since I just modified your patch slightly.
Since Gmail will probably mangle the inline patch, it is attached as well.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
--
fs/nfs/file.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index cc93865..22572af 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -428,8 +428,8 @@ static int do_vfs_lock(struct file *file
BUG();
}
if (res < 0)
- printk(KERN_WARNING "%s: VFS is out of sync with lock
manager!\n",
- __FUNCTION__);
+ dprintk("%s: VFS is out of sync with lock manager (res
= %d)!\n",
+ __FUNCTION__, res);
return res;
}
@@ -479,10 +479,13 @@ static int do_setlk(struct file *filp, i
* we clean up any state on the server. We therefore
* record the lock call as having succeeded in order to
* ensure that locks_remove_posix() cleans it out when
- * the process exits.
+ * the process exits. Make sure not to sleep if
+ * someone else holds the lock.
*/
- if (status == -EINTR || status == -ERESTARTSYS)
+ if (status == -EINTR || status == -ERESTARTSYS) {
+ fl->fl_flags &= ~FL_SLEEP;
do_vfs_lock(filp, fl);
+ }
} else
status = do_vfs_lock(filp, fl);
unlock_kernel();
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
[-- Attachment #2: VFS_is_out_of_sync_with_lock_manager.diff --]
[-- Type: text/x-patch, Size: 1176 bytes --]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
--
fs/nfs/file.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index cc93865..22572af 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -428,8 +428,8 @@ static int do_vfs_lock(struct file *file
BUG();
}
if (res < 0)
- printk(KERN_WARNING "%s: VFS is out of sync with lock manager!\n",
- __FUNCTION__);
+ dprintk("%s: VFS is out of sync with lock manager (res = %d)!\n",
+ __FUNCTION__, res);
return res;
}
@@ -479,10 +479,13 @@ static int do_setlk(struct file *filp, i
* we clean up any state on the server. We therefore
* record the lock call as having succeeded in order to
* ensure that locks_remove_posix() cleans it out when
- * the process exits.
+ * the process exits. Make sure not to sleep if
+ * someone else holds the lock.
*/
- if (status == -EINTR || status == -ERESTARTSYS)
+ if (status == -EINTR || status == -ERESTARTSYS) {
+ fl->fl_flags &= ~FL_SLEEP;
do_vfs_lock(filp, fl);
+ }
} else
status = do_vfs_lock(filp, fl);
unlock_kernel();
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-11-21 12:43 ` Jesper Juhl
@ 2006-11-27 9:19 ` Jesper Juhl
0 siblings, 0 replies; 17+ messages in thread
From: Jesper Juhl @ 2006-11-27 9:19 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Neil Brown, nfs, Linux Kernel Mailing List
Any chance we could get the patch below (or something similar) pushed
into 2.6.19?
On 21/11/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
> On 21/08/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > On Mon, 2006-08-21 at 13:34 +1000, Neil Brown wrote:
> > > Looking in fs/nfs/file.c (at 2.6.18-rc4-mm1 if it matters, but 2.6.17
> > > is much the same)
> > >
> > > - do_vfs_lock is only called when the filesystem was mounted with
> > > -o nolock EXCEPT
> > > - If a lock request to the server in interrupted (when mounted with
> > > -o intr) then do_vfs_lock is called to try to get the lock
> > > locally. Normally equivalent code will be called inside
> > > fs/lockd/clntproc.c when the server replies that the lock has been
> > > gained. In the case of an interrupt though this doesn't happen
> > > but the lock may still have happened on the server. So we record
> > > locally that the lock was gained, to ensure that it gets unlocked
> > > when the process exits.
> > >
> > > As you don't have '-o nolocks' you must be hitting the second case.
> > > The lock call to the server returns -EINTR or -ERESTARTSYS and
> > > do_vfs_lock is called just-in-case.
> > > As this is a just-in-case call, it is quite possible that the lock is
> > > held by some other process, so getting an error is entirely possible.
> > > So printing the message in this case seems wrong.
> > >
> > > On the other hand, printing the message in any other case seems wrong
> > > too, as server locking is not being used, so there is nothing to get
> > > out of sync with.
> > >
> > > As a further complication, I don't think that in the just-in-case
> > > situation that it should risk waiting for the lock.
> > > Now maybe we can be sure there is a pending signal which will break
> > > out of any wait (though I'm worried about -ERESTARTSYS - that doesn't
> > > imply a signal does it?), but I would feel more comfortable if
> > > FL_SLEEP were turned off in that path.
> > >
> > > So: Trond: Any obvious errors in the above?
> > > Is the following patch ok?
> >
> > Could we instead replace it with a dprintk() that returns the value of
> > "res"? That will keep it useful for debugging purposes.
> >
>
> How about the below?
> (compile tested only)
>
> Neil: I left your Signed-off-by line since I just modified your patch slightly.
>
> Since Gmail will probably mangle the inline patch, it is attached as well.
>
>
> Signed-off-by: Neil Brown <neilb@suse.de>
> Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
> --
>
> fs/nfs/file.c | 11 +++++++----
> 1 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfs/file.c b/fs/nfs/file.c
> index cc93865..22572af 100644
> --- a/fs/nfs/file.c
> +++ b/fs/nfs/file.c
> @@ -428,8 +428,8 @@ static int do_vfs_lock(struct file *file
> BUG();
> }
> if (res < 0)
> - printk(KERN_WARNING "%s: VFS is out of sync with lock
> manager!\n",
> - __FUNCTION__);
> + dprintk("%s: VFS is out of sync with lock manager (res
> = %d)!\n",
> + __FUNCTION__, res);
> return res;
> }
>
> @@ -479,10 +479,13 @@ static int do_setlk(struct file *filp, i
> * we clean up any state on the server. We therefore
> * record the lock call as having succeeded in order to
> * ensure that locks_remove_posix() cleans it out when
> - * the process exits.
> + * the process exits. Make sure not to sleep if
> + * someone else holds the lock.
> */
> - if (status == -EINTR || status == -ERESTARTSYS)
> + if (status == -EINTR || status == -ERESTARTSYS) {
> + fl->fl_flags &= ~FL_SLEEP;
> do_vfs_lock(filp, fl);
> + }
> } else
> status = do_vfs_lock(filp, fl);
> unlock_kernel();
>
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2006-08-21 19:54 ` Trond Myklebust
2006-11-21 12:43 ` Jesper Juhl
@ 2007-01-29 5:08 ` Neil Brown
2007-01-29 14:16 ` Trond Myklebust
1 sibling, 1 reply; 17+ messages in thread
From: Neil Brown @ 2007-01-29 5:08 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Jesper Juhl, nfs, Linux Kernel Mailing List
On Monday August 21, trond.myklebust@fys.uio.no wrote:
>
> Could we instead replace it with a dprintk() that returns the value of
> "res"? That will keep it useful for debugging purposes.
(only 5 months later...)
Sure, how about this?
Thanks,
NeilBrown
Remove warning: VFS is out of sync with lock manager.
But keep it as a dprintk
The message can be generated in a quite normal situation:
If a 'lock' request is interrupted, then the lock client needs to
record that the server has the lock, incase it does.
When we come the unlock, the server might say it doesn't, even
though we think it does (or might) and this generates the message.
Signed-off-by: Neil Brown <neilb@suse.de>
### Diffstat output
./fs/nfs/file.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff .prev/fs/nfs/file.c ./fs/nfs/file.c
--- .prev/fs/nfs/file.c 2007-01-29 16:04:09.000000000 +1100
+++ ./fs/nfs/file.c 2007-01-29 16:04:45.000000000 +1100
@@ -434,8 +434,9 @@ static int do_vfs_lock(struct file *file
BUG();
}
if (res < 0)
- printk(KERN_WARNING "%s: VFS is out of sync with lock manager!\n",
- __FUNCTION__);
+ dprintk(KERN_WARNING "%s: VFS is out of sync with lock manager"
+ " - error %d!\n",
+ __FUNCTION__, res);
return res;
}
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2007-01-29 5:08 ` Neil Brown
@ 2007-01-29 14:16 ` Trond Myklebust
2007-01-30 23:42 ` Jesper Juhl
0 siblings, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2007-01-29 14:16 UTC (permalink / raw)
To: Neil Brown; +Cc: Jesper Juhl, nfs, Linux Kernel Mailing List
On Mon, 2007-01-29 at 16:08 +1100, Neil Brown wrote:
> On Monday August 21, trond.myklebust@fys.uio.no wrote:
> >
> > Could we instead replace it with a dprintk() that returns the value of
> > "res"? That will keep it useful for debugging purposes.
>
> (only 5 months later...)
>
> Sure, how about this?
>
> Thanks,
> NeilBrown
ACKed.
Thanks Neil!
Trond
>
> Remove warning: VFS is out of sync with lock manager.
>
> But keep it as a dprintk
>
> The message can be generated in a quite normal situation:
> If a 'lock' request is interrupted, then the lock client needs to
> record that the server has the lock, incase it does.
> When we come the unlock, the server might say it doesn't, even
> though we think it does (or might) and this generates the message.
>
> Signed-off-by: Neil Brown <neilb@suse.de>
>
> ### Diffstat output
> ./fs/nfs/file.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff .prev/fs/nfs/file.c ./fs/nfs/file.c
> --- .prev/fs/nfs/file.c 2007-01-29 16:04:09.000000000 +1100
> +++ ./fs/nfs/file.c 2007-01-29 16:04:45.000000000 +1100
> @@ -434,8 +434,9 @@ static int do_vfs_lock(struct file *file
> BUG();
> }
> if (res < 0)
> - printk(KERN_WARNING "%s: VFS is out of sync with lock manager!\n",
> - __FUNCTION__);
> + dprintk(KERN_WARNING "%s: VFS is out of sync with lock manager"
> + " - error %d!\n",
> + __FUNCTION__, res);
> return res;
> }
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2007-01-29 14:16 ` Trond Myklebust
@ 2007-01-30 23:42 ` Jesper Juhl
2007-02-01 22:39 ` Neil Brown
0 siblings, 1 reply; 17+ messages in thread
From: Jesper Juhl @ 2007-01-30 23:42 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Neil Brown, nfs, Linux Kernel Mailing List
On 29/01/07, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Mon, 2007-01-29 at 16:08 +1100, Neil Brown wrote:
> > On Monday August 21, trond.myklebust@fys.uio.no wrote:
> > >
> > > Could we instead replace it with a dprintk() that returns the value of
> > > "res"? That will keep it useful for debugging purposes.
> >
> > (only 5 months later...)
> >
> > Sure, how about this?
> >
> > Thanks,
> > NeilBrown
>
> ACKed.
>
> Thanks Neil!
>
Finally getting that in will be sooooo nice :-) Thank you.
Btw: any reason why not to include the
fl->fl_flags &= ~FL_SLEEP;
bit as well? As in http://lkml.org/lkml/2006/11/27/41 ??
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
2007-01-30 23:42 ` Jesper Juhl
@ 2007-02-01 22:39 ` Neil Brown
0 siblings, 0 replies; 17+ messages in thread
From: Neil Brown @ 2007-02-01 22:39 UTC (permalink / raw)
To: Jesper Juhl; +Cc: Trond Myklebust, nfs, Linux Kernel Mailing List
On Wednesday January 31, jesper.juhl@gmail.com wrote:
> On 29/01/07, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> Finally getting that in will be sooooo nice :-) Thank you.
>
> Btw: any reason why not to include the
> fl->fl_flags &= ~FL_SLEEP;
> bit as well? As in http://lkml.org/lkml/2006/11/27/41 ??
>
Uhmm... I guess I had forgotten it ....
Looking again, I cannot convince myself that it is needed.
The comment says "If we were signalled ...", and if we were signalled
then do_vfs_lock won't block anyway. I think I was probably being
over-cautious as I didn't know the significance of testing for
-ERESTARTSYS.
However it seems to only get returned if a signal is pending (so why
EINTR isn't returned I still don't know) so a signal must be pending
on that piece of code, so there is no need to clear FL_SLEEP.
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!
@ 2006-08-12 8:58 Chuck Ebbert
0 siblings, 0 replies; 17+ messages in thread
From: Chuck Ebbert @ 2006-08-12 8:58 UTC (permalink / raw)
To: Jesper Juhl; +Cc: Grant Coady, linux-kernel, Trond Myklebust
In-Reply-To: <9a8748490608101537y4c377fb3xcd8babbdbc29cee2@mail.gmail.com>
On Fri, 11 Aug 2006 00:37:35 +0200, Jesper Juhl wrote:
> > > >I have some webservers that have recently started reporting the
> > > >following message in their logs :
> > > >
> > > > do_vfs_lock: VFS is out of sync with lock manager!
What does this (not even compile tested) patch print?
--- 2.6.17.8-nb/fs/lockd/clntproc.c 2006-06-10 17:39:21.000000000 -0400
+++ 2.6.17.8-nb/fs/lockd/clntproc.c.new 2006-08-12 04:43:45.000000000 -0400
@@ -458,7 +458,9 @@ static void nlmclnt_locks_init_private(s
static void do_vfs_lock(struct file_lock *fl)
{
int res = 0;
- switch (fl->fl_flags & (FL_POSIX|FL_FLOCK)) {
+ unsigned char flags = fl->fl_flags & (FL_POSIX|FL_FLOCK);
+
+ switch (flags) {
case FL_POSIX:
res = posix_lock_file_wait(fl->fl_file, fl);
break;
@@ -469,8 +471,8 @@ static void do_vfs_lock(struct file_lock
BUG();
}
if (res < 0)
- printk(KERN_WARNING "%s: VFS is out of sync with lock manager!\n",
- __FUNCTION__);
+ printk(KERN_WARNING "%s: VFS is out of sync with lock manager! -- %s: %d\n",
+ __FUNCTION__, flags == FL_POSIX ? "POSIX" : "FLOCK", res);
}
/*
--
Chuck
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2007-02-01 22:39 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-08-08 14:39 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager! Jesper Juhl
2006-08-09 5:53 ` Grant Coady
2006-08-09 8:07 ` Jesper Juhl
2006-08-10 22:37 ` Jesper Juhl
2006-08-11 0:30 ` Grant Coady
2006-08-13 23:08 ` Grant Coady
2006-08-17 6:49 ` [NFS] " Neil Brown
2006-08-17 9:58 ` Jesper Juhl
2006-08-21 3:34 ` Neil Brown
2006-08-21 19:54 ` Trond Myklebust
2006-11-21 12:43 ` Jesper Juhl
2006-11-27 9:19 ` Jesper Juhl
2007-01-29 5:08 ` Neil Brown
2007-01-29 14:16 ` Trond Myklebust
2007-01-30 23:42 ` Jesper Juhl
2007-02-01 22:39 ` Neil Brown
2006-08-12 8:58 Chuck Ebbert
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).