LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Hugh Dickins <hugh@veritas.com>
To: Matt Helsley <matthltc@us.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>,
	Miklos Szeredi <mszeredi@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	William Lee Irwin III <wli@holomorphy.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org, linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH] sys_remap_file_pages: fix ->vm_file accounting
Date: Thu, 7 Feb 2008 16:40:54 +0000 (GMT)	[thread overview]
Message-ID: <Pine.LNX.4.64.0802071546580.29324@blonde.site> (raw)
In-Reply-To: <1202343398.9062.253.camel@localhost.localdomain>

On Wed, 6 Feb 2008, Matt Helsley wrote:
> On Wed, 2008-02-06 at 20:33 +0000, Hugh Dickins wrote:
> > 
> > Sorry, Matt, I don't like your patch at all.  It seems to add a fair
> > amount of ugliness and unmaintainablity, all for a peculiar MVFS case
> 
> I thought that getting rid of the separate versions of proc_exe_link()
> improved maintainability.

That's a plus, but I don't see it balances the minus.

> Do you have any specific details on what you
> think makes the code introduced by the patch unmaintainable?

The fact that you now have to shadow operations on vm_file by
operations on your exe_file, easy to get out of step; and that
we'll tend not to notice when it goes wrong, because the common
case will have them both in the root mount (that will hide busy
errors, won't it? or am I mistaken?).

Though I should concede that your latest patch looks like it catches
all the places it needs to, and that they don't really stretch beyond
mm/mmap.c.  If you provided functions to do the get_file/fput/VM_EXE-
CUTABLE stuff, most of the ugliness would be confined to fs/proc/base.c.

I much preferred Andrew's take-a-copy-of-the-pathname approach, but
admit I'm not one to appreciate the namespace arguments against it.
I wish I had a more constructive solution.

> I still think it would help any stacking filesystems that can't use the
> solution adopted by unionfs.

If you think it's going to help lots of others, please get them to
speak up in support.  But we're rather short of stacking filesystems
in the kernel at present, so it can be hard to argue.

> > And I found it quite hard to see where the crucial difference comes.
> > I guess it's that MVFS changes vma->vm_file in its ->mmap?  Well, if
> 
> Yup.
> 
> > MVFS does that, maybe something else does that too, but precisely to
> > rely on the present behaviour of /proc/pid/exe - so in fixing for
> > MVFS, we'd be breaking that hypothetical other?
> 
> 	I'm not completely certain that I understand your point. Are you
> suggesting that some hypothetical code would want to use this "quirk"
> of /proc/pid/exe for a legitimate purpose?

Yes, but our different viewpoints give us a different idea of what's
a quirk.  The quirk I see is that MVFS is fiddling with vma->vm_file
in its ->mmap, which is rather unusual (though not unique); and then
complaining when this doesn't work as it wants.  But you see it as a
quirk that the VM_EXECUTABLE's vm_file gets used for /proc/pid/exe?

> 	Assuming that is your point, I thought my non-hypothetical java example
> clearly demonstrated that at least one non-hypothetical program doesn't
> expect the "quirk" and breaks because of it.

Yes, and I'm suggesting that if we change the behaviour to suit you,
we might be causing a regression in something else.  No evidence for
that, but it's a possibility (though probably not one I'd be arguing,
if I actually liked the patch involved).

When and if the VFS provides integrated support for stacking filesystems,
it would make sense to do something about this.  But as things stand, it's
for the stacking filesystem to manage its stack.  Why uglify the kernel
code (which doesn't include any user for this), instead of managing that
stacking yourself i.e. why not leave vm_file as is (or if necessary
point it to a proxy), and use its private_data for your optimizations?

> Frankly,
> given /proc/pid/exe's output in the non-stacking case, I can't see how
> its output in the stacking case we're discussing could be considered
> anything but buggy.

But whose bug?

Hugh

  reply	other threads:[~2008-02-07 16:42 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-30 14:20 Oleg Nesterov
2008-01-30 16:55 ` Miklos Szeredi
2008-01-30 17:26   ` Oleg Nesterov
2008-02-02 20:52     ` Matt Helsley
2008-02-02 21:17     ` Matt Helsley
2008-02-03 18:21       ` Oleg Nesterov
2008-02-06 20:33         ` Hugh Dickins
2008-02-07  0:16           ` Matt Helsley
2008-02-07 16:40             ` Hugh Dickins [this message]
2008-02-03 18:29     ` Oleg Nesterov
2008-02-06 20:13       ` Hugh Dickins
2008-02-11 10:15         ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0802071546580.29324@blonde.site \
    --to=hugh@veritas.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthltc@us.ibm.com \
    --cc=mingo@elte.hu \
    --cc=mszeredi@suse.cz \
    --cc=nickpiggin@yahoo.com.au \
    --cc=oleg@tv-sign.ru \
    --cc=wli@holomorphy.com \
    --subject='Re: [PATCH] sys_remap_file_pages: fix ->vm_file accounting' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).