LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Michael Kerrisk" <mtk.manpages@googlemail.com>
To: "Davide Libenzi" <davidel@xmailserver.org>
Cc: "Chris \"¥¯\" Heath" <chris@heathens.co.nz>,
	"David Schwartz" <davids@webmaster.com>,
	dada1@cosmosbay.com,
	"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>,
	linux-man@vger.kernel.org
Subject: Re: epoll design problems with common fork/exec patterns
Date: Fri, 29 Feb 2008 20:54:53 +0100	[thread overview]
Message-ID: <cfd18e0f0802291154j612b4238i66cb8323b72814ca@mail.gmail.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0802291111320.31028@alien.or.mcafeemobile.com>

On Fri, Feb 29, 2008 at 8:19 PM, Davide Libenzi <davidel@xmailserver.org> wrote:
> On Fri, 29 Feb 2008, Michael Kerrisk wrote:
>
> > As I think is clear, I've only given it very limited thought ;-).
> >
> > The point is that the existing implementation actually supports
> > "different *processes* sharing a single epoll fd and doing
> > epoll_wait() over it", but the semantics are unintuitive.  It may be
> > that the existing implementation was the best way of doing things.
> > But when I see the strange corner cases in the semantics, I can't help
> > but wonder (way too late), whether there might have been some other
> > way of implementing things that led to more intuitive semantics.
>
> Oh boy. The fact that you can have an epoll fd cross the fork boundary,
> does not mean that any indiscriminate use of it leads to sane results:

I ddidn't mean that it did.  Certainly in the current implementation
it there will insane situations ;-).

>        efd = epoll_create();
>        fork();
>                                        pipe(fds);
>                                        epoll_ctl(efd, ADD, fds[0]);
>        epoll_wait(); ????
>        ...
>        pipe(fds);
>        epoll_ctl(efd, ADD, fds[0]);
>                                        epoll_wait(); ????
>
>
> It is *NOT* a matter of semantics.

Of course -- but I don't think I suggested that I disagree on this.

> > >  If the next question is "But then why we made the epoll fd inheritable?",
> > >  the answer is, because it makes sense in many cases for a parent to hand
> > >  over an fd set to a child.
> >
> > Fair enough.
> >
> > So here's an idea about how things might alternatively have been done:
> >
> > a) The key for epoll entries could have been [file *, fd, PID]
> >
> > b) an epoll_wait() only returns events for fds where the PID maps that
> > of the caller.
> >
> > c) a close of a file descriptor removes the corresponding  [file *,
> > fd, PID] from the epoll set.
> >
> > d) when a fork() is done, then the epoll set has a new set of keys
> > added.  These are duplicates of the  [file *, fd, PID] entries for the
> > parent, but with the PID of the child substituted into the new keys.
> > Say the parent had PID 1000, and the child has PID 2000.  If the epoll
> > set initially contained:
> >
> > [X, 3, 1000]
> > [Y, 4, 1000]
> >
> > then after fork() we'd have:
> >
> > [X, 3, 1000]
> > [Y, 4, 1000]
> > [X, 3, 2000]
> > [Y, 4, 2000]
> >
> > There is of course room for debate about the efficiency of this
> > approach, I suppose.
>
> There sure is :)

Okay -- but I suspect it could have been made fairly efficient.

> > You said elsewhere:
> >
> > [[
> > That'd mean placing an eventpoll custom hook into sys_close(). Looks very
> > bad to me, and probably will look even worse to other kernel folks.
> > Is not much a performance issue (a check to see if a file* is an eventpoll
> > file is as easy as comparing the f_op pointer), but a design/style issue.
> > ]]
> >
> > But that wasn't very clear to me actually.  I note that filp_close()
> > already has special case handling for dnotify (R.I.P.) and fcntl()
> > )aka POSIX) file locks, so there was already precedent for a custom
> > hook, AFAICS, and epoll is at least as worthy of special treatment as
> > either of those cases.
>
> I guess that over the time, Al became software WRT junk going there :)

Sorry -- I don't understand that last sentence?

  reply	other threads:[~2008-02-29 19:55 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-27  6:22 Marc Lehmann
2007-10-27  8:23 ` Eric Dumazet
2007-10-27  8:51   ` Marc Lehmann
2007-10-27  9:22     ` Eric Dumazet
2007-10-27  9:34       ` Marc Lehmann
2007-10-27 10:23         ` Eric Dumazet
2007-10-27 10:46           ` Marc Lehmann
2007-10-27 16:59     ` Davide Libenzi
2007-10-27 17:38       ` Willy Tarreau
2007-10-27 18:01         ` Davide Libenzi
2007-10-29 22:36         ` Mark Lord
2007-10-28  4:47       ` David Schwartz
2007-10-28  9:33         ` Eric Dumazet
2007-10-28 21:04           ` David Schwartz
2007-10-29 18:55             ` Davide Libenzi
2008-02-26 15:13               ` Michael Kerrisk
2008-02-26 18:51                 ` Davide Libenzi
2008-02-27  1:30                   ` Chris "ク" Heath
2008-02-27 19:35                     ` Davide Libenzi
2008-02-28 13:12                       ` Michael Kerrisk
2008-02-28 13:23                         ` Michael Kerrisk
2008-02-28 19:34                           ` Davide Libenzi
2008-02-28 19:23                         ` Davide Libenzi
2008-02-29 15:46                           ` Michael Kerrisk
2008-02-29 19:19                             ` Davide Libenzi
2008-02-29 19:54                               ` Michael Kerrisk [this message]
2008-03-02 15:11                                 ` Sam Varshavchik
2008-03-02 21:44                                   ` Davide Libenzi
2007-10-28 18:48         ` Davide Libenzi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cfd18e0f0802291154j612b4238i66cb8323b72814ca@mail.gmail.com \
    --to=mtk.manpages@googlemail.com \
    --cc=chris@heathens.co.nz \
    --cc=dada1@cosmosbay.com \
    --cc=davidel@xmailserver.org \
    --cc=davids@webmaster.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-man@vger.kernel.org \
    --subject='Re: epoll design problems with common fork/exec patterns' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).