LKML Archive on lore.kernel.org
* [PATCH/RFC] Lustre VFS patch
@ 2004-05-24 11:39 Peter J. Braam
  2004-05-24 11:46 ` Jens Axboe
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Peter J. Braam @ 2004-05-24 11:39 UTC (permalink / raw)
  To: torvalds, akpm, linux-kernel; +Cc: 'Phil Schwan'

Hi Linus, Andrew & others,

Many people have asked me to have this patch reviewed and considered
for inclusion in 2.6 after some reworking.  We discussed these issues
at the kernel summit last summer.

This patch is not the Lustre file system itself, with the client file
system, Lustre RPC, server code, etc.  This patch would enable people
to run Lustre by loading modules into otherwise unmodified kernels,
and this appears to be something vendors really would like to see.
FYI, at the moment Lustre is just a little bit larger than NFS,
comparing clients, servers, and RPC code for each (60K vs 80K lines
of code).

This patch was written quite defensively: file systems not using 
intents should really be completely unaffected.

I attach a little tar ball with patches and a series file which
describes the order of the patches.  Below is a short description of
each of the patches.  I also want to tell you a few things we have 
worked on that are not present in this patch:
 
 - deal with "pinning" directories that are CWD or mount points.  They
   should not be removable/unlinkable by remote nodes, and the FS needs
   some indication of that.

 - in Lustre d_parent is not necessarily valid, so sys_getcwd needs to
   be careful.  In Ottawa we discussed adding an optional hook.

 - changes to NFSD to allow it to run on top of Lustre

 - per file system locking operations (refining i_sem) to allow
   parallel updates in one directory.

It goes without saying that many people in CFS have made contributions
to this patch and that the ideas were heavily influenced by discussions
with the kernel community.

Please give me your thoughts about how we can move forward with this.
 
Regards,

- Peter -
 
 
dev_read_only-vanilla-2.6.patch
 
  This introduces an ioctl on block devices to stop doing I/O.  The
  only purpose of the patch is automated recovery regression testing;
  it is very convenient to have this available.
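
  As a rough sketch of the mechanism (the identifiers below are
  illustrative, not the patch's actual symbols): an ioctl flips a
  per-device flag, and generic_make_request() then completes writes
  without queueing them:

	/* BDEV_NO_IO and bd_state are hypothetical; sketch only */
	static void dev_set_no_io(struct block_device *bdev, int on)
	{
		if (on)
			set_bit(BDEV_NO_IO, &bdev->bd_state);
		else
			clear_bit(BDEV_NO_IO, &bdev->bd_state);
	}

	/* ...and in generic_make_request(), before queueing the bio: */
	if (bio_data_dir(bio) == WRITE &&
	    test_bit(BDEV_NO_IO, &bio->bi_bdev->bd_state)) {
		/* complete "successfully" without touching the media */
		bio_endio(bio, bio->bi_size, 0);
		return;
	}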

export-vanilla-2.6.patch
 
  Export symbols used by Lustre.

header_guards-vanilla-2.6.patch
 
  Small bug fix to avoid double inclusion of headers.

lustre_version.patch
 
  A tiny header to check that the kernel patch and Lustre modules are
  compatible.
  
vfs-dcache_locking-vanilla-2.6.patch

  A trivial patch to make functions available to lustre that do
  d_move, d_rehash, without taking/dropping the dcache lock.
 
vfs-dcache_lustre_invalid-vanilla-2.6.patch
 
  This allows dentries to be d_invalidate'd if a bit is set in the
  dentry flags.  This is required when dentries that are busy on the
  local node are invalidated on another client system.

vfs-intent_api-vanilla-2.6.patch
 
  Introduce intents for other operations.  Add a file system hook to
  release intent data.  Make a few "intent versions" of functions such
  as "lookup_one_len_it" and "user_walk_it" available through headers.
  Arrange that the open intent is visible in the open methods. Add a
  few missing intent_init calls.
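
  For illustration, the intent-aware entry points presumably mirror
  their stock 2.6 counterparts with an extra intent argument; these
  signatures are a sketch, not the patch text:

	struct dentry *lookup_one_len_it(const char *name,
					 struct dentry *base, int len,
					 struct lookup_intent *it);
	int user_walk_it(const char *name, unsigned flags,
			 struct nameidata *nd, struct lookup_intent *it);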

vfs-intent_exec-vanilla-2.6.patch
 
  Add intents for open in the context of execution.
 
vfs-intent_init-vanilla-2.6.patch
 
  Add intent_init for all operations, not just for open.

vfs-intent_lustre-vanilla-2.6.patch
 
  Add a pointer to a per-fs intent_data structure to struct lookup_intent.
 
vfs-raw_ops-vanilla-2.6.patch

  This adds raw operations for setattr, mkdir, rmdir, mknod, unlink,
  symlink, link and rename.  The raw operations look up the parent
  directories (but not leaf nodes) involved in the operation and then
  ask the file system to execute the operation.  These methods allow
  us to delegate the execution of these functions to the server, and
  instantiate no dentries for leaf nodes; leaf nodes will only enter
  the dcache on subsequent lookups.  This patch dramatically
  simplifies the client/server lock management, particularly for
  rename.
 
  In Ottawa Linus suggested that we could maybe do this with intents
  instead.  I feel that both are ugly; both are possible, but intents
  looked awkward.
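
  Concretely, this suggests inode_operations grows raw counterparts
  roughly of the following shape (a sketch reconstructed from this
  description, not the patch itself):

	/* raw methods get the already-looked-up parent(s) via the
	 * nameidata and execute the whole operation remotely, never
	 * instantiating a dentry for the leaf */
	int (*mkdir_raw)(struct nameidata *nd, int mode);
	int (*unlink_raw)(struct nameidata *nd);
	int (*rename_raw)(struct nameidata *old_nd, struct nameidata *new_nd);
	/* ...and similarly setattr_raw, rmdir_raw, mknod_raw,
	 * symlink_raw, link_raw */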
 
vfs-revalidate_counter-vanilla-2.6.patch
 
  We found that dentries could be invalidated multiple times through
  activity on remote nodes, but this should not lead to failure in
  Lustre.  The revalidation path real_lookup was adapted to take a few
  more iterations through the revalidation before giving up.  This
  revalidation takes the form of first revalidating, then looking up
  if that fails, and possibly revalidating a few more times (see above)
  until there is success.
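
  In pseudo-form the adapted path looks something like this sketch
  (the bound and the names are illustrative):

	for (tries = 0; tries < MAX_REVAL_TRIES; tries++) {
		if (dentry->d_op->d_revalidate(dentry, nd))
			break;			/* dentry is (now) valid */
		dentry = real_lookup(parent, &this, nd);
		if (IS_ERR(dentry))
			break;
		/* loop: a remote node may have invalidated the fresh
		 * dentry again, so revalidate once more */
	}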
 
vfs-revalidate_special-vanilla-2.6.patch

 
  We add nameidata flags to indicate that a link sits in the middle of
  the path and a flag that indicates we are looking at the last
  component of a path.  [If someone knows if the existing flags could
  detect this, that would be welcome.]  To pass intents when the last
  pathname component is a "." we insert a "special" revalidation
  function that calls revalidate dentry when such a pathname is
  traversed.
 
vfs_intent-flags_rename-vanilla-2.6.patch
 
  As Linus requested in Ottawa last summer, rename intent.open.flags
  to intent.it_flags and intent.open.create_mode to
  intent.it_create_mode.  Remove the union of "intents", which only
  contained an open_intent, and replace it with a single struct
  lookup_intent included in the nameidata.
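
  In other words, the before/after shapes are roughly as follows
  (reconstructed from this description and the stock 2.6 headers):

	/* before (stock 2.6): */
	struct nameidata {
		/* ... */
		union {
			struct open_intent open;  /* .flags, .create_mode */
		} intent;
	};

	/* after (this patch): */
	struct lookup_intent {
		int it_flags;		/* was intent.open.flags */
		int it_create_mode;	/* was intent.open.create_mode */
		/* ... */
	};

	struct nameidata {
		/* ... */
		struct lookup_intent intent;
	};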
 
vfs-do_truncate.patch
  
  We added a parameter to do_truncate so that the FS can know if it is
  being called from open or not.
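
  Presumably along these lines (the parameter name is a guess):

	/* stock 2.6: int do_truncate(struct dentry *, loff_t);
	 * patched:   the extra flag tells the FS whether the truncate
	 *            comes from open(O_TRUNC) or truncate()/ftruncate() */
	int do_truncate(struct dentry *dentry, loff_t length, int from_open);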


[-- Attachment #2: linux-2.6.6-v1.0-lustre.tar.gz --]
[-- Type: application/x-gzip, Size: 14512 bytes --]


* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
@ 2004-05-24 11:46 ` Jens Axboe
  2004-05-25  1:48   ` braam
  2004-05-24 12:00 ` hch
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Jens Axboe @ 2004-05-24 11:46 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

On Mon, May 24 2004, Peter J. Braam wrote:
> dev_read_only-vanilla-2.6.patch
>  
>   This introduces an ioctl on block devices to stop doing I/O.  The
>   only purpose of the patch is automated recovery regression testing;
>   it is very convenient to have this available.

You still keep pushing this, without having clarified why you can't
just use the genhd functions for this instead of adding some array of
block_devices. And while you fixed the bio->bi_rw check, you still
don't indicate an error when you call bio_endio() for this case. So
the submitter of that I/O thinks it completed successfully, while you
just tossed it away.  Irk.
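
The helpers in question have roughly these 2.6 signatures:

	void set_device_ro(struct block_device *bdev, int flag);  /* mark/clear */
	int bdev_read_only(struct block_device *bdev);            /* test */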

-- 
Jens Axboe



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
  2004-05-24 11:46 ` Jens Axboe
@ 2004-05-24 12:00 ` hch
  2004-05-24 12:01 ` hch
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: hch @ 2004-05-24 12:00 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

Any chance you could please send one mail per patch the next time?
That makes reviewing them a lot easier.

> export-vanilla-2.6.patch
>  
>   Export symbols used by Lustre.

 	inodes_stat.nr_unused--;
 }
 
+EXPORT_SYMBOL(__iget);

	Explanation please, looks completely bogus.

+EXPORT_SYMBOL(do_kern_mount);

	Explanation please; while not completely bogus, probably
	bogus in this context.

+EXPORT_SYMBOL(truncate_complete_page);

	Ditto, this looks completely bogus too.

 EXPORT_SYMBOL(kallsyms_lookup);
 EXPORT_SYMBOL(__print_symbol);
+EXPORT_SYMBOL(kernel_text_address);

	no way

+EXPORT_SYMBOL(reparent_to_init);

	bogus.  All your kernel threads should call daemonize()

 
+EXPORT_SYMBOL(exit_files);

	ditto.



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
  2004-05-24 11:46 ` Jens Axboe
  2004-05-24 12:00 ` hch
@ 2004-05-24 12:01 ` hch
  2004-05-24 12:03 ` hch
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: hch @ 2004-05-24 12:01 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

On Mon, May 24, 2004 at 07:39:50PM +0800, Peter J. Braam wrote:
> lustre_version.patch
>  
>   A tiny header to check that the kernel patch and Lustre modules are
>   compatible.

	bogus.  check KERNEL_VERSION if you want this merged.



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
                   ` (2 preceding siblings ...)
  2004-05-24 12:01 ` hch
@ 2004-05-24 12:03 ` hch
  2004-05-24 15:33   ` Horst von Brand
  2004-05-24 12:05 ` hch
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: hch @ 2004-05-24 12:03 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

> vfs-dcache_locking-vanilla-2.6.patch
> 
>   A trivial patch to make functions available to lustre that do
>   d_move, d_rehash, without taking/dropping the dcache lock.


-void d_rehash(struct dentry * entry)
+void __d_rehash(struct dentry * entry)
 {
 	struct hlist_head *list = d_hash(entry->d_parent, entry->d_name.hash);
-	spin_lock(&dcache_lock);
  	entry->d_vfs_flags &= ~DCACHE_UNHASHED;
 	entry->d_bucket = list;
  	hlist_add_head_rcu(&entry->d_hash, list);
+}
+
+EXPORT_SYMBOL(__d_rehash);

Looks okay as a change, but an explanation is missing.

+EXPORT_SYMBOL(__d_move);
+
+void d_move(struct dentry *dentry, struct dentry *target)
+{
+	spin_lock(&dcache_lock);
+	__d_move(dentry, target);
 	spin_unlock(&dcache_lock);
 }

ditto.
 


* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
                   ` (3 preceding siblings ...)
  2004-05-24 12:03 ` hch
@ 2004-05-24 12:05 ` hch
  2004-05-24 18:06   ` Trond Myklebust
  2004-05-24 12:08 ` hch
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: hch @ 2004-05-24 12:05 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

> vfs-intent_api-vanilla-2.6.patch
>  
>   Introduce intents for other operations.  Add a file system hook to
>   release intent data.  Make a few "intent versions" of functions such
>   as "lookup_one_len_it" and "user_walk_it" available through headers.
>   Arrange that the open intent is visible in the open methods. Add a
>   few missing intent_init calls.

I can't comment on the exact change; you need to talk to Trond about
these.  But as-is they change the API exported to filesystems and thus
it's an absolute no-go for 2.6.  Where were you when Trond's intent
patches went in?  Hiding under a rock?



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
                   ` (4 preceding siblings ...)
  2004-05-24 12:05 ` hch
@ 2004-05-24 12:08 ` hch
  2004-05-24 13:44   ` Arjan van de Ven
  2004-05-24 14:19 ` viro
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: hch @ 2004-05-24 12:08 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

> vfs-raw_ops-vanilla-2.6.patch
> 
>   This adds raw operations for setattr, mkdir, rmdir, mknod, unlink,
>   symlink, link and rename.  The raw operations look up the parent
>   directories (but not leaf nodes) involved in the operation and then
>   ask the file system to execute the operation.  These methods allow
>   us to delegate the execution of these functions to the server, and
>   instantiate no dentries for leaf nodes; leaf nodes will only enter
>   the dcache on subsequent lookups.  This patch dramatically
>   simplifies the client/server lock management, particularly for
>   rename.
>  
>   In Ottawa Linus suggested that we could maybe do this with intents
>   instead.  I feel that both are ugly; both are possible, but intents
>   looked awkward.


This is complete crap.  We don't want two methods for every namespace
operation.  Please try to work out a scheme that needs only one method
fitting both Lustre and normal filesystems (I guess by passing struct
nameidata everywhere instead of just the dentry, and allowing no
instantiation for special filesystems).  But this is major surgery and
only makes sense for 2.7.x, and only if you actually want to merge
Lustre (or another filesystem making use of it) into mainline.



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 12:08 ` hch
@ 2004-05-24 13:44   ` Arjan van de Ven
  2004-05-24 13:53     ` viro
  2004-05-28 16:56     ` braam
  0 siblings, 2 replies; 26+ messages in thread
From: Arjan van de Ven @ 2004-05-24 13:44 UTC (permalink / raw)
  To: hch; +Cc: Peter J. Braam, torvalds, akpm, linux-kernel, 'Phil Schwan'

On Mon, 2004-05-24 at 14:08, hch@infradead.org wrote:
> > vfs-raw_ops-vanilla-2.6.patch
> > 
> >   This adds raw operations for setattr, mkdir, rmdir, mknod, unlink,
> >   symlink, link and rename.  The raw operations look up the parent
> >   directories (but not leaf nodes) involved in the operation and then
> >   ask the file system to execute the operation.  These methods allow
> >   us to delegate the execution of these functions to the server, and
> >   instantiate no dentries for leaf nodes; leaf nodes will only enter
> >   the dcache on subsequent lookups.  This patch dramatically
> >   simplifies the client/server lock management, particularly for
> >   rename.
> >  
> >   In Ottawa Linus suggested that we could maybe do this with intents
> >   instead.  I feel that both are ugly; both are possible, but intents
> >   looked awkward.
> 
> 
> This is complete crap.  We don't want two methods for every namespace
> operation.  

fun question: how does it deal with, say, a rename that would span mounts
on the client but wouldn't on the server? :)



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 13:44   ` Arjan van de Ven
@ 2004-05-24 13:53     ` viro
  2004-05-28 16:56     ` braam
  1 sibling, 0 replies; 26+ messages in thread
From: viro @ 2004-05-24 13:53 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: hch, Peter J. Braam, torvalds, akpm, linux-kernel, 'Phil Schwan'

On Mon, May 24, 2004 at 03:44:44PM +0200, Arjan van de Ven wrote:
> > This is complete crap.  We don't want two methods for every namespace
> > operation.  
> 
> > fun question: how does it deal with, say, a rename that would span mounts
> > on the client but wouldn't on the server? :)

Should not be allowed.


* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
                   ` (5 preceding siblings ...)
  2004-05-24 12:08 ` hch
@ 2004-05-24 14:19 ` viro
  2004-05-28 23:18 ` Maneesh Soni
  2004-05-29 17:53 ` Anton Blanchard
  8 siblings, 0 replies; 26+ messages in thread
From: viro @ 2004-05-24 14:19 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

On Mon, May 24, 2004 at 07:39:50PM +0800, Peter J. Braam wrote:
> vfs-intent_api-vanilla-2.6.patch
>  
>   Introduce intents for other operations.  Add a file system hook to
>   release intent data.  Make a few "intent versions" of functions such
>   as "lookup_one_len_it" and "user_walk_it" available through headers.
>   Arrange that the open intent is visible in the open methods. Add a
>   few missing intent_init calls.

Where is the code using it?  Without examples of use there's no way to
tell whether it's bogus or not.

As it is, it looks like a massive hook-adding exercise with no clear goal.
The same goes for the ..._raw variants of the methods - yes, something in
that direction would make sense; however, splitting the codepath at the
top level looks like a bloody bad idea.  _IF_ we get hooks of that sort,
the right thing to do is to provide helpers and make normal filesystems
use them, passing the current foo_mknod et al. as callbacks.  And in any
case, that needs discussing - it's not obvious that "let's take over the
entire work past lookup of parents" is the best choice here.

Again, without examples of users for that stuff you are asking for blind
change of API I'm really not comfortable with.


* Re: [PATCH/RFC] Lustre VFS patch 
  2004-05-24 12:03 ` hch
@ 2004-05-24 15:33   ` Horst von Brand
  2004-05-25 20:43     ` hch
  0 siblings, 1 reply; 26+ messages in thread
From: Horst von Brand @ 2004-05-24 15:33 UTC (permalink / raw)
  To: hch
  Cc: Peter J. Braam, torvalds, akpm, linux-kernel,
	'Phil Schwan',
	vonbrand

hch@infradead.org said:
> > vfs-dcache_locking-vanilla-2.6.patch
> > 
> >   A trivial patch to make functions available to lustre that do
> >   d_move, d_rehash, without taking/dropping the dcache lock.
> 
> 
> -void d_rehash(struct dentry * entry)
> +void __d_rehash(struct dentry * entry)
>  {
>  	struct hlist_head *list = d_hash(entry->d_parent, entry->d_name.hash);
> -	spin_lock(&dcache_lock);
>   	entry->d_vfs_flags &= ~DCACHE_UNHASHED;
>  	entry->d_bucket = list;
>   	hlist_add_head_rcu(&entry->d_hash, list);
> +}

Won't you also need a non-__ version, perhaps like so:

   void d_rehash(struct dentry *entry)
   {
       spin_lock(&dcache_lock);
       __d_rehash(entry);
       spin_unlock(&dcache_lock);
   }
   EXPORT_SYMBOL(d_rehash);

?

Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>

(Surely such a triviality doesn't need this, but...)
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 12:05 ` hch
@ 2004-05-24 18:06   ` Trond Myklebust
  2004-05-25  8:21     ` braam
  0 siblings, 1 reply; 26+ messages in thread
From: Trond Myklebust @ 2004-05-24 18:06 UTC (permalink / raw)
  To: hch
  Cc: Peter J. Braam, Linus Torvalds, Andrew Morton, linux-kernel,
	'Phil Schwan'

On Mon, 24/05/2004 at 08:05, hch@infradead.org wrote:
> I can't comment on the exact change, you need to talk to trond about
> these.  But as-is they change the API exported to filesystems and thus
> it's and absolute no-go for 2.6.   Where have you been when Trond's
> intent patches went in?  Hiding under a rock?

To be fair: at the time, Peter and I did indeed discuss the changes that
were needed in order to support both Lustre and NFS. The main reason why
I ended up sending in the "NFS minimal" patch was, IIRC, that Peter was
not ready at the time to send in a full Lustre client that could make
use of this interface.

Peter, I have a couple of objections here:

        vfs_intent-flags_rename-vanilla-2.6.patch and
        vfs-intent_exec-vanilla-2.6.patch breaks NFS (though ironically
        it fixes CIFS) due to that gratuitous change of semantics from
        FMODE_READ/FMODE_WRITE to O_RDONLY/O_WRONLY/O_RDWR. Exactly why
        couldn't Lustre work with the native VFS semantics?
        
        vfs_intent-flags_rename-vanilla-2.6.patch also reverts the
        format from being a union of intents for various operations to
        being a single structure. This goes against what was agreed upon
        on linux-fsdevel when this issue was discussed last summer (in
        fact Linus was the one who requested the union approach). What
        justification exists now for a change?

        The vfs-intent_lustre-vanilla-2.6.patch + the "intent_release()"
        code. What if you end up crossing a mountpoint? How do you then
        know to which superblock/filesystem the private field belongs if
        there is more than one user of this mechanism?

        vfs-revalidate_counter-vanilla-2.6.patch: can't "counter" be put
        into the private part of your intent structure so that the whole
        test may be done within Lustre?

        vfs-revalidate_special-vanilla-2.6.patch: see the use of the
        flag FS_REVAL_DOT in order to enable the revalidation of the
        current directory for those filesystems that need to do so.
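
        For reference, roughly how NFS sets it (approximate 2.6 usage,
        abbreviated):

                static struct file_system_type nfs_fs_type = {
                        .owner    = THIS_MODULE,
                        .name     = "nfs",
                        .get_sb   = nfs_get_sb,
                        .kill_sb  = nfs_kill_super,
                        .fs_flags = FS_ODD_RENAME | FS_REVAL_DOT,
                };

        With that set, the VFS revalidates the dentry even when the
        lookup ends in ".".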

Cheers,
  Trond


* RE: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:46 ` Jens Axboe
@ 2004-05-25  1:48   ` braam
  2004-05-25  6:47     ` Jens Axboe
  0 siblings, 1 reply; 26+ messages in thread
From: braam @ 2004-05-25  1:48 UTC (permalink / raw)
  To: 'Jens Axboe'; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

Hi Jens,

We use this patch on servers as follows.

Lustre servers give an immediate response to RPCs from clients, and later
indicate which transaction numbers have been committed to disk. At known
points in the execution we sync all transactions to disk and execute our
ioctl.  When the ioctl is issued, Lustre is also instructed not to send
disk commit confirmations to clients. Then the system continues to
execute some transactions, but only in memory, and sends responses to
clients.  We are sure they are lost if we powercycle that system.  This
enables tests for replay of transactions by client nodes in the cluster.

If we were to return errors (which, I agree, _seems_ much more sane, and
we _did_ try that for a while!), then there is a good chance, namely
immediately when something is flushed to disk, that the system will
detect the errors and not continue to execute transactions, making
consistent testing of our replay mechanisms impossible.

I hope that this explains why we do not return errors.  Now if you tell
me that I can turn off I/O, and not get errors, with existing ioctls,
then I certainly should use the existing ioctls.  Can you clarify that?

Am I making sense to you now?

- Peter -


> -----Original Message-----
> From: Jens Axboe [mailto:axboe@suse.de] 
> Sent: Monday, May 24, 2004 7:47 PM
> To: Peter J. Braam
> Cc: torvalds@osdl.org; akpm@osdl.org; 
> linux-kernel@vger.kernel.org; 'Phil Schwan'
> Subject: Re: [PATCH/RFC] Lustre VFS patch
> 
> On Mon, May 24 2004, Peter J. Braam wrote:
> > dev_read_only-vanilla-2.6.patch
> >  
> >   This introduces an ioctl on block devices to stop doing I/O.  The
> >   only purpose of the patch is automated recovery regression testing;
> >   it is very convenient to have this available.
> 
> You still keep pushing this, without having clarified why 
> you can't just use the genhd functions for this instead of 
> adding some array of block_devices. And while you fixed the 
> bio->bi_rw check, you still don't indicate an error when you 
> call bio_endio() for this case. So the submitter of that I/O 
> thinks it completed successfully, while you just tossed it away.
> Irk.
> 
> --
> Jens Axboe
> 



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-25  1:48   ` braam
@ 2004-05-25  6:47     ` Jens Axboe
  2004-05-25  8:21       ` braam
  0 siblings, 1 reply; 26+ messages in thread
From: Jens Axboe @ 2004-05-25  6:47 UTC (permalink / raw)
  To: braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

On Tue, May 25 2004, braam wrote:
> Hi Jens,
> 
> We use this patch on servers as follows. 
> 
> Lustre servers give an immediate response to RPCs from clients, and
> later indicate which transaction numbers have been committed to disk.
> At known points in the execution we sync all transactions to disk and
> execute our ioctl.  When the ioctl is issued, Lustre is also instructed
> not to send disk commit confirmations to clients. Then the system
> continues to execute some transactions, but only in memory, and sends
> responses to clients.  We are sure they are lost if we powercycle
> that system.  This enables tests for replay of transactions by client
> nodes in the cluster.
> 
> If we were to return errors (which, I agree, _seems_ much more sane,
> and we _did_ try that for a while!), then there is a good chance,
> namely immediately when something is flushed to disk, that the system
> will detect the errors and not continue to execute transactions,
> making consistent testing of our replay mechanisms impossible.
> 
> I hope that this explains why we do not return errors.  Now if you
> tell me that I can turn off I/O, and not get errors, with existing
> ioctls, then I certainly should use the existing ioctls.  Can you
> clarify that?
> 
> Am I making sense to you now?

Not really, since you are not answering my question at all... My
question is not why you need this code or how you are using it, it's
why you cannot use existing functionality to do the same. Look at
genhd.c; it has functions for checking/marking/clearing the read-only
bit on a block_device.

And if this is to make sense for inclusion, I/O _must_ be ended with
-EROFS or similar.

It seems to me that this probably belongs in your test harness for
debugging purposes. At least in its current state it's not acceptable
for inclusion.

-- 
Jens Axboe



* RE: [PATCH/RFC] Lustre VFS patch
  2004-05-24 18:06   ` Trond Myklebust
@ 2004-05-25  8:21     ` braam
  0 siblings, 0 replies; 26+ messages in thread
From: braam @ 2004-05-25  8:21 UTC (permalink / raw)
  To: 'Trond Myklebust', hch
  Cc: 'Linus Torvalds', 'Andrew Morton',
	linux-kernel, 'Phil Schwan',
	oleg

Hi Trond,

Thanks for your very helpful reply ... Here are a few comments. 

> Peter, I have a couple of objections here
> 
>         vfs_intent-flags_rename-vanilla-2.6.patch and
>         vfs-intent_exec-vanilla-2.6.patch breaks NFS (though ironically
>         it fixes CIFS) due to that gratuitous change of semantics from
>         FMODE_READ/FMODE_WRITE to O_RDONLY/O_WRONLY/O_RDWR.  Exactly why
>         couldn't Lustre work with the native VFS semantics?

Our error (this patch was wrong) - you are right, we should use FMODE_READ.

>         vfs_intent-flags_rename-vanilla-2.6.patch also reverts the
>         format from being a union of intents for various operations to
>         being a single structure.  This goes against what was agreed upon
>         on linux-fsdevel when this issue was discussed last summer (in
>         fact Linus was the one who requested the union approach).  What
>         justification exists now for a change?

Justification: Linus asked me to.  He said there is only intent data for
open really, so please remove the union.  I don't particularly care.

>         The vfs-intent_lustre-vanilla-2.6.patch + the "intent_release()"
>         code. What if you end up crossing a mountpoint? How do you then
>         know to which superblock/filesystem the private field belongs if
>         there is more than one user of this mechanism?

Well spotted.  This is only used during the last component of lookup, so we
don't much care about this as we traverse intermediate path names.  Have I
missed something else here?

>         vfs-revalidate_counter-vanilla-2.6.patch: can't "counter" be put
>         into the private part of your intent structure so that the whole
>         test may be done within Lustre?

Clever, we will do that.

>         vfs-revalidate_special-vanilla-2.6.patch: see the use of the
>         flag FS_REVAL_DOT in order to enable the revalidation of the
>         current directory for those filesystems that need to do so.

Didn't know this flag was there; we will use FS_REVAL_DOT.

Oleg Drokin is reworking the patches for these issues.

Thanks, this simplifies things!

- Peter -



* RE: [PATCH/RFC] Lustre VFS patch
  2004-05-25  6:47     ` Jens Axboe
@ 2004-05-25  8:21       ` braam
  2004-05-25  8:27         ` Jens Axboe
  2004-05-25 10:52         ` Lars Marowsky-Bree
  0 siblings, 2 replies; 26+ messages in thread
From: braam @ 2004-05-25  8:21 UTC (permalink / raw)
  To: 'Jens Axboe'; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

Jens,

I think I do answer your question:  
...
> > If we were to return errors (which, I agree, _seems_ much more sane,
> > and we _did_ try that for a while!), then there is a good chance,
> > namely immediately when something is flushed to disk, that the system
> > will detect the errors and not continue to execute transactions,
> > making consistent testing of our replay mechanisms impossible.

So: we can use the flags, but we cannot return the errors.

> And if this is to make sense for inclusion, I/O _must_ be 
> ended with -EROFS or similar.
> 
> It seems to me that this probably belongs in your test 
> harness for debugging purposes. At least in its current state 
> it's not acceptable for inclusion.

This is, as I mentioned, only for testing.  It is, clearly, NOT ordinary
system behavior at all, since we don't, and won't, return the error.

Some people find it very convenient to have this available, but if the
opinion is that it is better to let development teams manage their own
testing infrastructure, that is acceptable to me.

- Peter -





* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-25  8:21       ` braam
@ 2004-05-25  8:27         ` Jens Axboe
  2004-05-25 10:52         ` Lars Marowsky-Bree
  1 sibling, 0 replies; 26+ messages in thread
From: Jens Axboe @ 2004-05-25  8:27 UTC (permalink / raw)
  To: braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

On Tue, May 25 2004, braam wrote:
> Jens,
> 
> I think I do answer your question:  
> ...
> > > If we were to return errors (which, I agree, _seems_ much more sane,
> > > and we _did_ try that for a while!), then there is a good chance,
> > > namely immediately when something is flushed to disk, that the system
> > > will detect the errors and not continue to execute transactions,
> > > making consistent testing of our replay mechanisms impossible.
> 
> So: we can use the flags, but we cannot return the errors.

The generic_make_request() change itself is fine, as long as the proper
error is propagated back. I don't object to that at all, and I outlined
that to Phil last week as well. So in short:

        if (bio_data_dir(bio) == WRITE && bdev_read_only(bio->bi_bdev)) {
                bio_endio(bio, bio->bi_size, -EROFS);
                break;
        }

If you want to pass back 0 instead, then that would be a one-liner in
your (private) debugging patch. Ok?

> > And if this is to make sense for inclusion, I/O _must_ be 
> > ended with -EROFS or similar.
> > 
> > It seems to me that this probably belongs in your test 
> > harness for debugging purposes. At least in its current state 
> > it's not acceptable for inclusion.
> 
> This is, as I mentioned, only for testing.  It is, clearly, NOT ordinary
> system behavior at all, since we don't, and won't, return the error.
> 
> Some people find it very convenient to have this available, but if the
> opinion is that it is better to let development teams manage their own
> testing infrastructure, that is acceptable to me.

I don't think this change makes sense as written for the generic
kernels, not if you want to simply ignore the write. If that is the
case, it's a special-case debug entry for a very narrow use (i.e. Lustre).

-- 
Jens Axboe



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-25  8:21       ` braam
  2004-05-25  8:27         ` Jens Axboe
@ 2004-05-25 10:52         ` Lars Marowsky-Bree
  2004-05-25 11:45           ` braam
  2004-05-25 13:35           ` Kevin Corry
  1 sibling, 2 replies; 26+ messages in thread
From: Lars Marowsky-Bree @ 2004-05-25 10:52 UTC (permalink / raw)
  To: braam, 'Jens Axboe'
  Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

On 2004-05-25T16:21:29,
   braam <braam@clusterfs.com> said:

> I think I do answer your question:  ...
> > > If we were to return errors (which, I agree, _seems_ much more
> > > sane, and we _did_ try that for a while!), then there is a good
> > > chance, namely immediately when something is flushed to disk, that
> > > the system will detect the errors and not continue to execute
> > > transactions, making consistent testing of our replay mechanisms
> > > impossible.
> So: we can use the flags, but we cannot return the errors.

Maybe I am missing something here, but is this testing not somewhat
unrealistic then? In the general case, the system in production _will_
report an error and not silently throw away the writes.

> Some people find it very convenient to have this available, but if the
> opinion is that it is better to let development teams manage their own
> testing infrastructure that is acceptable to me.

Yes, this is very "convenient" and actually, "some people" think it is
absolutely mandatory that the kernel which is used for production sites
is 1:1 bit-wise identical to the one used for load & stress testing;
otherwise the testing is void to a certain degree...

Maybe you could fix this in the test harness / Lustre itself instead and
silently discard the writes internally if told so via an (internal)
option, instead of needing a change deeper down in the IO layer, or use
a DM target which can give you all the failure scenarios you need?

In particular the last one - a fault-injection DM target - seems like a
very valuable tool for testing in general, but the Lustre-internal
approach may be easier in the long run.


Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering	      \ ever tried. ever failed. no matter.
SUSE Labs			      | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ 	-- Samuel Beckett



* RE: [PATCH/RFC] Lustre VFS patch
  2004-05-25 10:52         ` Lars Marowsky-Bree
@ 2004-05-25 11:45           ` braam
  2004-05-25 13:35           ` Kevin Corry
  1 sibling, 0 replies; 26+ messages in thread
From: braam @ 2004-05-25 11:45 UTC (permalink / raw)
  To: 'Lars Marowsky-Bree', 'Jens Axboe'
  Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

Hi Lars, 

> -----Original Message-----
> From: Lars Marowsky-Bree [mailto:lmb@suse.de] 
> Sent: Tuesday, May 25, 2004 6:53 PM
> To: braam; 'Jens Axboe'
> Cc: torvalds@osdl.org; akpm@osdl.org; 
> linux-kernel@vger.kernel.org; 'Phil Schwan'
> Subject: Re: [PATCH/RFC] Lustre VFS patch
> 
> On 2004-05-25T16:21:29,
>    braam <braam@clusterfs.com> said:
> 
> > I think I do answer your question:  ...
> > > > If we were to return errors (which, I agree, _seems_ much more
> > > > sane, and we _did_ try that for a while!), then there is a good
> > > > chance, namely immediately when something is flushed to disk,
> > > > that the system will detect the errors and not continue to
> > > > execute transactions, making consistent testing of our replay
> > > > mechanisms impossible.
> > So: we can use the flags, but we cannot return the errors.
> 
> Maybe I am missing something here, but is this testing not 
> somewhat unrealistic then? In the general case, the system in 
> production _will_ report an error and not silently throw away 
> the writes.

I would not say "unrealistic": it is a harsh way to systematically and
consistently generate failure patterns that otherwise depend on winning
races with the flushing daemons.

Semantically, what we add is a new flag:
 IGNORE_IO_ERRORS
The ioctl in our patch has the same effect as setting IGNORE_IO_ERRORS |
RDONLY.

Is it really terrible to have that flag?
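
Combined with the check Jens posted elsewhere in this thread, it would
look something like this (the flag test is hypothetical, our naming):

	if (bio_data_dir(bio) == WRITE && bdev_read_only(bio->bi_bdev)) {
		int err = -EROFS;
		if (bdev_ignore_io_errors(bio->bi_bdev))  /* hypothetical */
			err = 0;  /* silently drop the write for testing */
		bio_endio(bio, bio->bi_size, err);
		break;
	}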

> > Some people find it very convenient to have this available, but if the
> > opinion is that it is better to let development teams manage their own
> > testing infrastructure, that is acceptable to me.
> 
> Yes, this is very "convenient" and actually, "some people" 
> think it is absolutely mandatory that the kernel which is 
> used for production sites is 1:1 bit-wise identical to the 
> one used for load & stress testing; otherwise the testing is 
> void to a certain degree...
> 
> Maybe you could fix this in the test harness / Lustre itself 
> instead and silently discard the writes internally if told so 
> via an (internal) option, instead of needing a change deeper 
> down in the IO layer, or use a DM target which can give you 
> all the failure scenarios you need?
> 
> In particular the last one - a fault-injection DM target - 
> seems like a very valuable tool for testing in general, but 
> the Lustre-internal approach may be easier in the long run.

Yes, a (virtual) block device can do this easily.  If Jens can accept the
new flag, that is easiest; if not, we will hack up a DM target in due course.

- Peter -



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-25 10:52         ` Lars Marowsky-Bree
  2004-05-25 11:45           ` braam
@ 2004-05-25 13:35           ` Kevin Corry
  2004-05-25 13:55             ` Jens Axboe
  1 sibling, 1 reply; 26+ messages in thread
From: Kevin Corry @ 2004-05-25 13:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lars Marowsky-Bree, braam, 'Jens Axboe',
	torvalds, akpm, 'Phil Schwan'

On Tuesday 25 May 2004 5:52 am, Lars Marowsky-Bree wrote:
> Maybe you could fix this in the test harness / Lustre itself instead and
> silently discard the writes internally if told so via an (internal)
> option, instead of needing a change deeper down in the IO layer, or use
> a DM target which can give you all the failure scenarios you need?
>
> In particular the last one - a fault-injection DM target - seems like a
> very valuable tool for testing in general, but the Lustre-internal
> approach may be easier in the long run.

See dm-flakey.c in the latest -udm patchset for a fairly simple version of a 
"fault-injection" target.

http://sources.redhat.com/dm/patches.html

-- 
Kevin Corry
kevcorry@us.ibm.com
http://evms.sourceforge.net/


* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-25 13:35           ` Kevin Corry
@ 2004-05-25 13:55             ` Jens Axboe
  0 siblings, 0 replies; 26+ messages in thread
From: Jens Axboe @ 2004-05-25 13:55 UTC (permalink / raw)
  To: Kevin Corry
  Cc: linux-kernel, Lars Marowsky-Bree, braam, torvalds, akpm,
	'Phil Schwan'

On Tue, May 25 2004, Kevin Corry wrote:
> On Tuesday 25 May 2004 5:52 am, Lars Marowsky-Bree wrote:
> > Maybe you could fix this in the test harness / Lustre itself instead and
> > silently discard the writes internally if told so via an (internal)
> > option, instead of needing a change deeper down in the IO layer, or use
> > a DM target which can give you all the failure scenarios you need?
> >
> > In particular the last one - a fault-injection DM target - seems like a
> > very valuable tool for testing in general, but the Lustre-internal
> > approach may be easier in the long run.
> 
> See dm-flakey.c in the latest -udm patchset for a fairly simple version of a 
> "fault-injection" target.
> 
> http://sources.redhat.com/dm/patches.html

Would by far be the superior solution.

-- 
Jens Axboe



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 15:33   ` Horst von Brand
@ 2004-05-25 20:43     ` hch
  0 siblings, 0 replies; 26+ messages in thread
From: hch @ 2004-05-25 20:43 UTC (permalink / raw)
  To: Horst von Brand
  Cc: Peter J. Braam, torvalds, akpm, linux-kernel, 'Phil Schwan'

On Mon, May 24, 2004 at 11:33:49AM -0400, Horst von Brand wrote:
> Won't you also need a non-__ version, perhaps like so:
> 
>    void d_rehash(struct dentry *entry)
>    {
>        spin_lock(&dcache_lock);
>        __d_rehash(entry);
>        spin_unlock(&dcache_lock);
>    }
>    EXPORT_SYMBOL(d_rehash);
> 
> ?

Yes, and that is in fact included in their patch; I just didn't
quote it because it didn't seem relevant for the review.

One more reason why review requests that point at a URL to a tarball
of patches are evil...



* RE: [PATCH/RFC] Lustre VFS patch
  2004-05-24 13:44   ` Arjan van de Ven
  2004-05-24 13:53     ` viro
@ 2004-05-28 16:56     ` braam
  2004-05-28 17:00       ` Christoph Hellwig
  1 sibling, 1 reply; 26+ messages in thread
From: braam @ 2004-05-28 16:56 UTC (permalink / raw)
  To: arjanv, hch; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

Hi Arjan, 

> -----Original Message-----
> From: Arjan van de Ven [mailto:arjanv@redhat.com] 
> 
> fun question: how does it deal with say a rename that would 
> span mounts on the client but wouldn't on the server? :)


Mostly, checks are done as in sys_rename.

Some cases require new distributed state in the FS, such as the fact
that a certain directory is a mountpoint, possibly not on the node
doing a rename, but on another node.

For this the Linux VFS has no API - we added something we call
"pinning" for this in 2.4, but not in 2.6 yet.

- Peter -



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-28 16:56     ` braam
@ 2004-05-28 17:00       ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2004-05-28 17:00 UTC (permalink / raw)
  To: braam; +Cc: arjanv, torvalds, akpm, linux-kernel, 'Phil Schwan'

On Sat, May 29, 2004 at 12:56:40AM +0800, braam wrote:
> Mostly, checks are done as in sys_rename.
> 
> Some cases require new distributed state in the FS, such as the fact
> that a certain directory is a mountpoint, possibly not on the node
> doing a rename, but on another node.
> 
> For this the Linux VFS has no API - we added something we call
> "pinning" for this in 2.4, but not in 2.6 yet.

In general I'd be happier to see code like that residing in the VFS,
especially as I guess other filesystems like AFS would like to have
similar features.

Where's the current Lustre code that sits behind those interfaces?  Do
you have a patch that adds the Lustre client to the kernel, instead of
the huge CVS repository containing all kinds of unrelated code a la the
obsolete Lustre 1.0?



* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
                   ` (6 preceding siblings ...)
  2004-05-24 14:19 ` viro
@ 2004-05-28 23:18 ` Maneesh Soni
  2004-05-29 17:53 ` Anton Blanchard
  8 siblings, 0 replies; 26+ messages in thread
From: Maneesh Soni @ 2004-05-28 23:18 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'

Hi Peter,

I have some questions about revalidate_special.

In the case of "." we do revalidate_special(), which calls
real_lookup(), which can create a negative dentry if "." is unhashed,
and return no error.  After that we don't have any check for a
negative dentry in link_path_walk().  The result could be that we
return success from link_path_walk() with a negative dentry, and I
think that is bad.
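
A guard along these lines after that path would presumably be needed
(just a sketch):

	/* after revalidate_special()/real_lookup() for "." */
	if (!dentry->d_inode) {
		dput(dentry);
		err = -ENOENT;
		break;
	}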

Maneesh





On Mon, May 24, 2004 at 07:39:50PM +0800, Peter J. Braam wrote:
[...]
> vfs-revalidate_special-vanilla-2.6.patch
> 
>   We add nameidata flags to indicate that a link sits in the middle of
>   the path and a flag that indicates we are looking at the last
>   component of a path.  [If someone knows if the existing flags could
>   detect this, that would be welcome.]  To pass intents when the last
>   pathname component is a "." we insert a "special" revalidation
>   function that calls revalidate dentry when such a pathname is
>   traversed.
[...]


-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-25044999 Fax: 91-80-25268553
T/L : 9243696


* Re: [PATCH/RFC] Lustre VFS patch
  2004-05-24 11:39 [PATCH/RFC] Lustre VFS patch Peter J. Braam
                   ` (7 preceding siblings ...)
  2004-05-28 23:18 ` Maneesh Soni
@ 2004-05-29 17:53 ` Anton Blanchard
  8 siblings, 0 replies; 26+ messages in thread
From: Anton Blanchard @ 2004-05-29 17:53 UTC (permalink / raw)
  To: Peter J. Braam; +Cc: torvalds, akpm, linux-kernel, 'Phil Schwan'


Hi Peter,

> This patch is not the Lustre file system itself, with the client file
> system, Lustre RPC, server code, etc.  This patch would enable people
> to run Lustre by loading modules into otherwise unmodified kernels,
> and this appears to be something vendors really would like to see.
> FYI, at the moment Lustre is just a little bit larger than NFS,
> comparing clients, servers, and RPC code for each (60K vs 80K lines
> of code).

Seems to me we really need to look at this in tandem with the
filesystem itself.  Do you have a URL we can grab it from?

Have these patches undergone any significant testing?

Anton

