LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Gianluca Alberici <gianluca@abinetworks.biz>
To: Chuck Lever <chuck.lever@oracle.com>,
	linux-kernel@vger.kernel.org,
	NFS list <linux-nfs@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9
Date: Wed, 06 Feb 2008 19:25:21 +0100	[thread overview]
Message-ID: <47A9FB91.2040304@abinetworks.biz> (raw)
In-Reply-To: <AB6DF303-5C5B-453C-9992-DAF826FF02CE@oracle.com>

Hello all,

Thanks to Chuck's help i finally decided to proceed to a git bisect and 
found the bad patch. Is there anybody that has an idea why it breaks 
userspace nfs servers as we have seen ? Sorry for emailing directly 
Chuck Lever and Andrew Morton but i really wanted to thank Chuck for his 
precious help and thought that /akpm/ having signed this commit maybe 
he's going to figure out whats wrong easily

This is what i finally get from git:

1c710c896eb461895d3c399e15bb5f20b39c9073 is first bad commit
commit 1c710c896eb461895d3c399e15bb5f20b39c9073
Author: Ulrich Drepper <drepper@redhat.com>
Date:   Tue May 8 00:33:25 2007 -0700

    utimensat implementation

    Implement utimensat(2) which is an extension to futimesat(2) in that it

    a) supports nano-second resolution for the timestamps
    b) allows to selectively ignore the atime/mtime value
    c) allows to selectively use the current time for either atime or mtime
    d) supports changing the atime/mtime of a symlink itself along the lines
       of the BSD lutimes(3) functions

[...]

    [akpm@linux-foundation.org: add missing i386 syscall table entry]
    Signed-off-by: Ulrich Drepper <drepper@redhat.com>
    Cc: Alexey Dobriyan <adobriyan@openvz.org>
    Cc: Michael Kerrisk <mtk-manpages@gmx.net>
    Cc: <linux-arch@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

:040000 040000 3bedbc7fd919ba167b8e5f208a630261570853bb 927002a9423dcb51ba4f7bee53e60cdca6c1df43 M      arch
:040000 040000 fd688c5b534efd3111cbf1e1095d6ff631738325 3d0fbf20fb3da1cb380c92f5b2b39815897376d3 M      fs
:040000 040000 bfb1a907a9a842db4fa3543e12a8381d4e11b1eb 9c1d99324db12e066c0d17870fe48457809ad43b M      include

Thanks in advance, regards,

Gianluca

> Hi Gianluca-
>
> On Jan 30, 2008, at 7:40 AM, Gianluca Alberici wrote:
>
>> Hello again everybody
>>
>> Here follows the testbench:
>>
>> - I got two mirrors, same machine, same disk etc...chaged hostname, 
>> IP, and on the second i have recompiled kernel.
>> - First: 2.6.21.7 on debian sarge
>> - Second: 2.6.22 same system.
>> - Onto both i got nfs-user-server and cfsd last versions
>> - The export file is the same (localhost /opt/nfs (rw, async), 
>> stripping off the async option does not changes anything)
>> - Mount options are exactly the same.
>>
>> The problem arises in the very same manner with both nfs and cfsd:
>>
>> NFS:setattr {
>> ...
>> ...
>> RPC:call_decode {
>> return 22;
>> }
>> ...
>> return 22;
>> }
>
>
> Again, there is nothing wrong with the RPC client or call_decode. The 
> *server* is returning NFSERR_INVAL (22) to a SETATTR request; the RPC 
> client is simply passing that along to the NFS client, as it is 
> designed to do.
>
>> I have tried these kernels:
>>
>> 2.6.16.11 works
>> 2.6.20 works
>> 2.6.21 works
>> 2.6.21.7 works
>> 2.6.22 doesnt work (contiguous to previous version)
>> 2.6.23 doesnt work (same behavior as previous)
>> 2.6.23.9 doesnt work (as above)
>> 2.6.24rc7 doesnt work (as above)
>>
>> I would really like to do more, client or server side, if you ave any 
>> suggestions.
>> Can we find out what is the change (doesnt matter if it is a buf or 
>> bug fix) that caused this problem ?
>
>
> The goal here is to identify the kernel change between 2.6.21 and 
> 2.6.22 that makes the client generate SETATTR requests the user-space 
> server chokes on. It may be a change in the NFS client, or it could be 
> somewhere else in the file system stack, like the VFS.
>
> The usual procedure is to use "git bisect". It does a binary search on 
> the kernel patches between the working kernel version and the kernel 
> version that is known not to work. It works like this:
>
> 1. You clone a linux kernel git repository (if you don't have a git
> repository already)
>
> 2. You tell git bisect which kernel version is working, and which isn't.
> git bisect then selects a commit about half way in between the working
> and non-working versions, and checks out that version of the kernel
>
> 3. You build that kernel, and run your test case
>
> 4. You tell git bisect whether the resulting kernel passes your test 
> case,
> it selects a new commit, and checks out that version of the kernel.
>
> 5. Repeat steps 3 and 4 until git bisect has identified the commit that
> causes the kernel to stop passing your test case
>
> If the number of patches between 2.6.21 and 2.6.22 is N, then git 
> bisect will find the faulty patch in O(log2(N)) steps. For example, if 
> there are 250 patches between 2.6.21 and 2.6.22, it will take about 8 
> iterations of steps 3 and 4 to find the faulty patch, if all goes 
> well; far fewer than the total number of patches you would need to 
> test one at a time.
>
> Naturally you can also do this by applying and reverting patches with 
> "patch -p1", but it's a little more work.
>
>> Chuck Lever wrote:
>>
>>> On Jan 29, 2008, at 3:31 PM, Trond Myklebust wrote:
>>>
>>>> On Tue, 2008-01-29 at 20:50 +0100, Gianluca Alberici wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I confirm that i have encountered this same problem (EINVAL on open
>>>>> (...O | TRUNC) with the following userspace servers:
>>>>>
>>>>> - nfs-user-server shipped with debian sarge/etch etc...
>>>>> - cfsd (crypto file system which is an nfs server)
>>>>>
>>>>> I want to underline again that these userspace servers have been 
>>>>> woking
>>>>> perfectly until 2.6.21.7 (which is the last 2.6.21)
>>>>> Since 2.6.22 the problem came out and it is still present into 2.6.24
>>>>> rc7 (last i tested). Conclusion: there must have been something 
>>>>> that is
>>>>> changed in 2.6.22 that caused the problem.
>>>>
>>>>
>>>>
>>>> The only difference between these two dumps are the fact that the 
>>>> first
>>>> one isn't using the Sun convention for telling NFSv2 servers to set to
>>>> the current time (see the code in xdr_encode_current_server_time).
>>>
>>>
>>>
>>> I thought I saw that on both SETATTRs, but I could be wrong.
>>>
>>>> I don't see why this would be new behaviour after 2.6.21. The code for
>>>> this has been in the NFS client since 2.6.15 at least...
>>>
>>>
>>>
>>>
>>> A mount option is set on one test client, and not the other, perhaps?
>>>
>>> -- 
>>> Chuck Lever
>>> chuck[dot]lever[at]oracle[dot]com
>>> -
>>> To unsubscribe from this list: send the line "unsubscribe linux- 
>>> nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
>>
>> -
>> To unsubscribe from this list: send the line "unsubscribe linux- nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> -- 
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html



  parent reply	other threads:[~2008-02-06 18:28 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-22 10:52 Gianluca Alberici
2007-12-23  7:17 ` Scott
2007-12-23 11:35   ` Gianluca Alberici
2007-12-25 22:04     ` Andrew Morton
2007-12-26 11:14       ` Gianluca Alberici
2007-12-26 14:24       ` Chuck Lever
2008-01-19 11:29         ` Gianluca Alberici
2008-01-19 12:35         ` Gianluca Alberici
     [not found]           ` <5FD6714F-EF9A-4F07-B2B6-D6F6CC911936@oracle.com>
     [not found]             ` <479C744A.6020207@abinetworks.biz>
     [not found]               ` <12964A18-350B-443F-B15A-D78B3723C89A@oracle.com>
     [not found]                 ` <479F2463.2040704@abinetworks.biz>
     [not found]                   ` <4AAA3DAF-898C-4ED5-BD07-4FD2B5CEEF16@oracle.com>
     [not found]                     ` <Pine.LNX.4.64.0801291851110.4820@maggie.lkpg.cendio.se>
     [not found]                       ` <7EE4B02B-3359-41C0-BFED-0947DF9F5F5A@oracle.com>
     [not found]                         ` <479F8377.6090704@abinetworks.biz>
     [not found]                           ` <1201638661.7969.7.camel@heimdal.trondhjem.org>
     [not found]                             ` <F7BFE847-A3C3-44E5-A238-5C69ED3EE1C4@oracle.com>
     [not found]                               ` <47A0704D.7080808@abinetworks.biz>
     [not found]                                 ` <AB6DF303-5C5B-453C-9992-DAF826FF02CE@oracle.com>
2008-02-06 18:25                                   ` Gianluca Alberici [this message]
2008-02-06 19:47                                     ` Chuck Lever
2008-02-06 21:55                                       ` Gianluca Alberici
2008-02-06 22:16                                         ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47A9FB91.2040304@abinetworks.biz \
    --to=gianluca@abinetworks.biz \
    --cc=akpm@linux-foundation.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --subject='Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).