LKML Archive on lore.kernel.org
* Re: epoll and shared fd's
       [not found]           ` <a19Lb-1F0-11@gated-at.bofh.it>
@ 2008-02-26 18:16             ` Bodo Eggert
  2008-02-28 12:10               ` Michael Kerrisk
  2008-02-28 13:53               ` Valdis.Kletnieks
  0 siblings, 2 replies; 16+ messages in thread
From: Bodo Eggert @ 2008-02-26 18:16 UTC (permalink / raw)
  To: Michael Kerrisk, Davide Libenzi, Pierre Habouzit, lkml,
	Eric Dumazet, Marc Lehmann, David Schwartz

Michael Kerrisk <mtk.manpages@googlemail.com> wrote:

> a) I did a
> 
> s/internal kernel handle/open file description/
> 
> since that is the POSIX term for the internal handle.
> 
> b) It seems to me that you text doesn't quite make the point explicit
> enough.  I've tried to rewrite it; could you please check:
> 
>        A6     Yes, but be aware of the following point.  A  file
>               descriptor is a reference to an open file descrip-
>               tion (see  open(2)).   Whenever  a  descriptor  is
>               duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
>               or fork(2), a new file descriptor referring to the
>               same  open  file  description is created.  An open
>               file description continues to exist until all file
>               descriptors referring to it have been closed.  The
>               epoll  interface  automatically  removes  a   file
>               descriptor  from  an  epoll set only after all the
>               file descriptors referring to the underlying  open
>               file  handle  have  been  closed.  This means that
>               even after a file descriptor that is  part  of  an
>               epoll  set has been closed, events may be reported
>               for that file descriptor if other file descriptors
>               referring  to the same underlying file description
>               remain open.
> 
> Does that seem okay?  I plan to include the text in man-pages-2.79.

It's hard to read for me, and probably very hard to read for others.




* Re: epoll and shared fd's
  2008-02-26 18:16             ` epoll and shared fd's Bodo Eggert
@ 2008-02-28 12:10               ` Michael Kerrisk
  2008-02-28 19:17                 ` Bodo Eggert
  2008-02-28 13:53               ` Valdis.Kletnieks
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Kerrisk @ 2008-02-28 12:10 UTC (permalink / raw)
  To: 7eggert
  Cc: Davide Libenzi, Pierre Habouzit, lkml, Eric Dumazet,
	Marc Lehmann, David Schwartz

On Tue, Feb 26, 2008 at 7:16 PM, Bodo Eggert <7eggert@gmx.de> wrote:
> Michael Kerrisk <mtk.manpages@googlemail.com> wrote:
>
>  > a) I did a
>  >
>  > s/internal kernel handle/open file description/
>  >
>  > since that is the POSIX term for the internal handle.
>  >
>  > b) It seems to me that you text doesn't quite make the point explicit
>  > enough.  I've tried to rewrite it; could you please check:
>  >
>  >        A6     Yes, but be aware of the following point.  A  file
>  >               descriptor is a reference to an open file descrip-
>  >               tion (see  open(2)).   Whenever  a  descriptor  is
>  >               duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
>  >               or fork(2), a new file descriptor referring to the
>  >               same  open  file  description is created.  An open
>  >               file description continues to exist until all file
>  >               descriptors referring to it have been closed.  The
>  >               epoll  interface  automatically  removes  a   file
>  >               descriptor  from  an  epoll set only after all the
>  >               file descriptors referring to the underlying  open
>  >               file  handle  have  been  closed.  This means that
>  >               even after a file descriptor that is  part  of  an
>  >               epoll  set has been closed, events may be reported
>  >               for that file descriptor if other file descriptors
>  >               referring  to the same underlying file description
>  >               remain open.
>  >
>  > Does that seem okay?  I plan to include the text in man-pages-2.79.
>
>  It's hard to read for me, and probably very hard to read for others.

Bodo,

I'm just reviewing this text, trying to see if I can improve it.  At
the moment, I'm a little stuck.  Can you say a little more about why
you find it hard to read?  That may help me improve it.

Cheers,

Michael


-- 
Michael Kerrisk
Maintainer of the Linux man-pages project
http://www.kernel.org/doc/man-pages/
Want to report a man-pages bug?  Look here:
http://www.kernel.org/doc/man-pages/reporting_bugs.html


* Re: epoll and shared fd's
  2008-02-26 18:16             ` epoll and shared fd's Bodo Eggert
  2008-02-28 12:10               ` Michael Kerrisk
@ 2008-02-28 13:53               ` Valdis.Kletnieks
  2008-02-28 15:08                 ` Michael Kerrisk
  2008-02-28 19:27                 ` Davide Libenzi
  1 sibling, 2 replies; 16+ messages in thread
From: Valdis.Kletnieks @ 2008-02-28 13:53 UTC (permalink / raw)
  To: 7eggert
  Cc: Michael Kerrisk, Davide Libenzi, Pierre Habouzit, lkml,
	Eric Dumazet, Marc Lehmann, David Schwartz


On Tue, 26 Feb 2008 19:16:30 +0100, Bodo Eggert said:
> Michael Kerrisk <mtk.manpages@googlemail.com> wrote:

> >               file  handle  have  been  closed.  This means that
> >               even after a file descriptor that is  part  of  an
> >               epoll  set has been closed, events may be reported
> >               for that file descriptor if other file descriptors
> >               referring  to the same underlying file description
> >               remain open.

Is it worth making special mention of the case where a process gets events
for an FD that it has closed, because a parent or child process still has
an inherited copy of the FD open?
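
The fork case can be reproduced with a short test. The following is only a sketch, assuming Linux with glibc; the function name events_after_parent_close() is made up for illustration. The parent registers the read end of a pipe, forks, closes its own descriptor, and then still sees an event, because the child's inherited copy keeps the underlying kernel object alive.

```c
#include <signal.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name). Returns the number of events
 * epoll_wait() reports to the parent after the parent has closed its
 * registered descriptor, or -1 if the setup failed.
 */
int events_after_parent_close(void)
{
    int p[2], epfd, n = -1;
    pid_t child;
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event got;

    if (pipe(p) == -1)
        return -1;
    epfd = epoll_create(1);
    ev.data.fd = p[0];
    if (epfd == -1 || epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1) {
        close(p[0]);
        close(p[1]);
        if (epfd != -1)
            close(epfd);
        return -1;
    }

    child = fork();
    if (child == 0) {
        /* Child: does nothing, but its inherited copy of p[0] stays open. */
        for (;;)
            pause();
    }

    close(p[0]);                        /* the parent is "done" with the FD */
    if (child > 0 && write(p[1], "x", 1) == 1)
        n = epoll_wait(epfd, &got, 1, 0);   /* poll without blocking */

    if (child > 0) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
    }
    close(p[1]);
    close(epfd);
    return n;
}
```

Calling events_after_parent_close() from a test program should return 1 on a kernel with the semantics discussed in this thread: the parent is told about a descriptor it no longer has open.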



* Re: epoll and shared fd's
  2008-02-28 13:53               ` Valdis.Kletnieks
@ 2008-02-28 15:08                 ` Michael Kerrisk
  2008-02-28 19:27                 ` Davide Libenzi
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Kerrisk @ 2008-02-28 15:08 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: 7eggert, Davide Libenzi, Pierre Habouzit, lkml, Eric Dumazet,
	Marc Lehmann, David Schwartz

On Thu, Feb 28, 2008 at 2:53 PM,  <Valdis.Kletnieks@vt.edu> wrote:
> On Tue, 26 Feb 2008 19:16:30 +0100, Bodo Eggert said:
>  > Michael Kerrisk <mtk.manpages@googlemail.com> wrote:
>
>
> > >               file  handle  have  been  closed.  This means that
>  > >               even after a file descriptor that is  part  of  an
>  > >               epoll  set has been closed, events may be reported
>  > >               for that file descriptor if other file descriptors
>  > >               referring  to the same underlying file description
>  > >               remain open.
>
>  Is it worth making special mention of the case where a process gets events
>  for a FD that it has closed, because a parent or child process still has
>  an inherited copy of the FD still open?

I'm not sure -- perhaps under a BUGS section?  Did you read my reply
about this point in the thread "Re: epoll design problems with common
fork/exec patterns"?

Cheers,

Michael


-- 
Michael Kerrisk
Maintainer of the Linux man-pages project
http://www.kernel.org/doc/man-pages/
Want to report a man-pages bug?  Look here:
http://www.kernel.org/doc/man-pages/reporting_bugs.html


* Re: epoll and shared fd's
  2008-02-28 12:10               ` Michael Kerrisk
@ 2008-02-28 19:17                 ` Bodo Eggert
  2008-02-28 19:30                   ` Davide Libenzi
  0 siblings, 1 reply; 16+ messages in thread
From: Bodo Eggert @ 2008-02-28 19:17 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: 7eggert, Davide Libenzi, Pierre Habouzit, lkml, Eric Dumazet,
	Marc Lehmann, David Schwartz

On Thu, 28 Feb 2008, Michael Kerrisk wrote:

> On Tue, Feb 26, 2008 at 7:16 PM, Bodo Eggert <7eggert@gmx.de> wrote:
> > Michael Kerrisk <mtk.manpages@googlemail.com> wrote:
> >
> >  > b) It seems to me that you text doesn't quite make the point explicit
> >  > enough.  I've tried to rewrite it; could you please check:
> >  >
> >  >        A6     Yes, but be aware of the following point.  A  file
> >  >               descriptor is a reference to an open file descrip-
> >  >               tion (see  open(2)).   Whenever  a  descriptor  is
> >  >               duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
[.........]

> >  > Does that seem okay?  I plan to include the text in man-pages-2.79.
> >
> >  It's hard to read for me, and probably very hard to read for others.
> 
> Bodo,
> 
> I'm just reviewing this text, trying to see if I can improve it.  At
> the moment, I'm a little stuck.  can you say a little more about why
> you find it hard to read?  that may help me improve it.

I think it's enough to mention that the last copy of the file descriptor
(e.g. one created by dup or fork) must be closed *or* the file must be
explicitly unregistered (as far as I understand it by now).
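
The "last copy must be closed" half of that rule can be checked with a small sketch like the one below. It assumes Linux with glibc, and the helper name events_before_and_after_last_close() is invented for this example. It counts ready events once while a dup()ed copy is still open and once after that copy has been closed too.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name). Stores in *before the number of
 * ready events seen while a duplicate is still open, and in *after the
 * number seen once the last copy has been closed. Returns 0 on success,
 * -1 if the setup failed.
 */
int events_before_and_after_last_close(int *before, int *after)
{
    int p[2], epfd, copy, rc = -1;
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event got;

    if (pipe(p) == -1)
        return -1;
    epfd = epoll_create(1);
    copy = dup(p[0]);                   /* second descriptor, same open file */
    ev.data.fd = p[0];

    if (epfd != -1 && copy != -1 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        close(p[0]);                    /* registered descriptor is closed */
        p[0] = -1;
        *before = epoll_wait(epfd, &got, 1, 0);
        close(copy);                    /* now the last copy is gone too */
        copy = -1;
        *after = epoll_wait(epfd, &got, 1, 0);
        rc = 0;
    }

    if (p[0] != -1)
        close(p[0]);
    if (copy != -1)
        close(copy);
    if (epfd != -1)
        close(epfd);
    close(p[1]);
    return rc;
}
```

With the behaviour described in this thread, *before should come back as 1 and *after as 0: the registration disappears only once no descriptor refers to the open file any more.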
-- 
"Aim towards the Enemy."
-Instruction printed on US Rocket Launcher


* Re: epoll and shared fd's
  2008-02-28 13:53               ` Valdis.Kletnieks
  2008-02-28 15:08                 ` Michael Kerrisk
@ 2008-02-28 19:27                 ` Davide Libenzi
  1 sibling, 0 replies; 16+ messages in thread
From: Davide Libenzi @ 2008-02-28 19:27 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: 7eggert, Michael Kerrisk, Pierre Habouzit, lkml, Eric Dumazet,
	Marc Lehmann, David Schwartz

On Thu, 28 Feb 2008, Valdis.Kletnieks@vt.edu wrote:

> On Tue, 26 Feb 2008 19:16:30 +0100, Bodo Eggert said:
> > Michael Kerrisk <mtk.manpages@googlemail.com> wrote:
> 
> > >               file  handle  have  been  closed.  This means that
> > >               even after a file descriptor that is  part  of  an
> > >               epoll  set has been closed, events may be reported
> > >               for that file descriptor if other file descriptors
> > >               referring  to the same underlying file description
> > >               remain open.
> 
> Is it worth making special mention of the case where a process gets events
> for a FD that it has closed, because a parent or child process still has
> an inherited copy of the FD still open?

And for all the others, there's epoll_ctl(EPOLL_CTL_DEL) :)
The close(2) (f_op->release actually) hook is for cleanup semantics. If 
you play with multiple processes, just use epoll_ctl(EPOLL_CTL_DEL) and 
you'll be fine.
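
As a sketch of that advice (assuming Linux with glibc; the wrapper name unregister_and_close() is hypothetical and not part of any library), a program can pair every close() of a registered descriptor with an explicit EPOLL_CTL_DEL, so the result no longer depends on whether another process or a dup()ed copy still holds the file open.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Illustrative wrapper (hypothetical name): remove fd from the epoll set
 * epfd, then close it. Returns 0 if both steps succeed, -1 otherwise.
 */
int unregister_and_close(int epfd, int fd)
{
    /*
     * The event argument is ignored for EPOLL_CTL_DEL, but kernels before
     * 2.6.9 required it to be non-NULL, so a dummy structure is passed.
     */
    struct epoll_event unused = { 0 };
    int rc = epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &unused);

    /* Close the descriptor even if the removal failed. */
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

The order matters: the EPOLL_CTL_DEL has to come before the close(), since the descriptor number is needed to name the registration. The cost is the extra syscall that Pierre mentioned in his original report.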



- Davide




* Re: epoll and shared fd's
  2008-02-28 19:17                 ` Bodo Eggert
@ 2008-02-28 19:30                   ` Davide Libenzi
  0 siblings, 0 replies; 16+ messages in thread
From: Davide Libenzi @ 2008-02-28 19:30 UTC (permalink / raw)
  To: Bodo Eggert
  Cc: Michael Kerrisk, Pierre Habouzit, lkml, Eric Dumazet,
	Marc Lehmann, David Schwartz

On Thu, 28 Feb 2008, Bodo Eggert wrote:

> On Thu, 28 Feb 2008, Michael Kerrisk wrote:
> 
> > On Tue, Feb 26, 2008 at 7:16 PM, Bodo Eggert <7eggert@gmx.de> wrote:
> > > Michael Kerrisk <mtk.manpages@googlemail.com> wrote:
> > >
> > >  > b) It seems to me that you text doesn't quite make the point explicit
> > >  > enough.  I've tried to rewrite it; could you please check:
> > >  >
> > >  >        A6     Yes, but be aware of the following point.  A  file
> > >  >               descriptor is a reference to an open file descrip-
> > >  >               tion (see  open(2)).   Whenever  a  descriptor  is
> > >  >               duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
> [.........]
> 
> > >  > Does that seem okay?  I plan to include the text in man-pages-2.79.
> > >
> > >  It's hard to read for me, and probably very hard to read for others.
> > 
> > Bodo,
> > 
> > I'm just reviewing this text, trying to see if I can improve it.  At
> > the moment, I'm a little stuck.  can you say a little more about why
> > you find it hard to read?  that may help me improve it.
> 
> I think it's enough to mention that the last copy of the file descriptor 
> (e.g. by dup or fork) must be closed *or* the file must be explicitely 
> unregistered (As far as I understand by now).

Exactly!
Michael, I noticed that there's a reference to it in Q6. We'd better clarify
that one.



- Davide




* Re: epoll and shared fd's
  2008-02-26 19:14               ` Michael Kerrisk
@ 2008-02-26 19:31                 ` Davide Libenzi
  0 siblings, 0 replies; 16+ messages in thread
From: Davide Libenzi @ 2008-02-26 19:31 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Pierre Habouzit, lkml, Eric Dumazet, Marc Lehmann, David Schwartz

On Tue, 26 Feb 2008, Michael Kerrisk wrote:

> Okay -- I'll look at it some more.  I am however loathe to drop the
> term open file description, because POSIX uses, as well as a number of
> other Linux man pages by now.

Heh, POSIX. Now, it doesn't take a genius to see that "file description" and
"file descriptor" look amazingly similar, does it? :)


> > That'd mean placing an eventpoll custom hook into sys_close(). Looks 
> > very bad to me, and probably will look even worse to other kernel 
> > folks. Is not much a performance issue (a check to see if a file* is 
> > an eventpoll file is as easy as comparing the f_op pointer), but a 
> > design/style issue.
>
> Oh -- I wasn't suggesting we could make the change now -- it would
> break the ABI and all that.  I was just wondering why the decision
> wasn't made to do it the other way to begin with.  The existing
> semantics are somewhat couterintuitive, and potentially interact
> libraries that do private manipulations with file descriptors.

For the same reason that a custom hook in sys_close wouldn't have passed 
the radar ;)
As far as problems with libraries doing tricks with fds, that's an issue 
that goes beyond epoll.



- Davide




* Re: epoll and shared fd's
  2008-02-26 19:04             ` Davide Libenzi
@ 2008-02-26 19:14               ` Michael Kerrisk
  2008-02-26 19:31                 ` Davide Libenzi
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Kerrisk @ 2008-02-26 19:14 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Pierre Habouzit, lkml, Eric Dumazet, Marc Lehmann, David Schwartz

On Tue, Feb 26, 2008 at 8:04 PM, Davide Libenzi <davidel@xmailserver.org> wrote:
>
> On Tue, 26 Feb 2008, Michael Kerrisk wrote:
>
> > Following up after quite some time:
> >
> > Davide Libenzi wrote:
> > > On Sat, 26 Jan 2008, Michael Kerrisk wrote:
> > >
> > >> On Jan 25, 2008 12:57 AM, Davide Libenzi <davidel@xmailserver.org> wrote:
> > >>> On Thu, 24 Jan 2008, Pierre Habouzit wrote:
> > >>>
> > >>>> On Fri, Jan 18, 2008 at 09:10:18PM +0000, Davide Libenzi wrote:
> > >>>>> On Fri, 18 Jan 2008, Pierre Habouzit wrote:
> > >>>>>
> > >>>>>>   Hi,
> > >>>>>>
> > >>>>>>   I just came across a strange behavior of epoll that seems to
> > >>>>>> contradict the documentation. Here is what happens:
> > >>>>>>
> > >>>>>> * I have two processes P1 and P2, P1 accept()s connections, and send the
> > >>>>>>   resulting file descriptors to P2 through a unix socket.
> > >>>>>>
> > >>>>>> * P2 registers the received socket in his epollfd.
> > >>>>>>
> > >>>>>>   [time passes]
> > >>>>>>
> > >>>>>> * P2 is done with the socket and closes it
> > >>>>>>
> > >>>>>> * P2 gets events for the socket again !
> > >>>>>>
> > >>>>>>
> > >>>>>>   Though the documentation says that if a process closes a file
> > >>>>>> descriptor, it gets unregistered. And yes I'm sure that P2 doens't dup()
> > >>>>>> the file descriptor. Though (because of a bug) it was still open in
> > >>>>>> P1[0], hence the referenced socket still live at the kernel level.
> > >>>>>>
> > >>>>>>   Of course the userland workaround is to force the EPOLL_CTL_DEL before
> > >>>>>> the close, which I now do, but costs me a syscall where I wanted to
> > >>>>>> spare one :|
> > >>>>> For epoll, a close is when the kernel file* is released (that is, when all
> > >>>>> its instances are gone).
> > >>>>> We could put a special handling in filp_close(), but I don't think is a
> > >>>>> good idea, and we're better live with the current behaviour.
> > >>>>   Okay, maybe updating the linux manpages to be more clear about that is
> > >>>> the way to go then. Thanks
> > >>> Sure. I'll send Michael Kerrisk and updated statement for the A6 answer in
> > >>> the epoll man page.
> > >> Thanks Davide -- yes please send me a patch.
> > >> --
> > >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > >> the body of a message to majordomo@vger.kernel.org
> > >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > >> Please read the FAQ at  http://www.tux.org/lkml/
> > >>
> > >
> > > Something like the one below ...
> > >
> > >
> > > - Davide
> > >
> > >
> > >
> > > --- epoll.4 2008-01-26 12:58:21.000000000 -0800
> > > +++ epoll.4.new     2008-01-26 13:06:36.000000000 -0800
> > > @@ -285,7 +285,19 @@
> > >  sets automatically?
> > >  .TP
> > >  .B A6
> > > -Yes.
> > > +A file descriptor is the userspace counterpart of an internal kernel handle.
> > > +Every time a process calls functions liks
> > > +.BR dup (2),
> > > +.BR dup2 (2)
> > > +or
> > > +.BR fork (2),
> > > +a new file descriptor referring to the same internal kernel handle is
> > > +created. The internal kernel handle remains alive until all the userspace
> > > +file descriptors have been closed.
> > > +The
> > > +.BR epoll (4)
> > > +interface automatically removes the internal kernel handle from the set,
> > > +once all the file descriptor instances have been closed.
> > >  .TP
> > >  .B Q7
> > >  If more than one event occurs between
> >
> > Davide,
> >
> > Two points.
> >
> > a) I did a
> >
> > s/internal kernel handle/open file description/
> >
> > since that is the POSIX term for the internal handle.
> >
> > b) It seems to me that you text doesn't quite make the point explicit
> > enough.  I've tried to rewrite it; could you please check:
> >
> >        A6     Yes, but be aware of the following point.  A  file
> >               descriptor is a reference to an open file descrip-
> >               tion (see  open(2)).   Whenever  a  descriptor  is
> >               duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
> >               or fork(2), a new file descriptor referring to the
> >               same  open  file  description is created.  An open
> >               file description continues to exist until all file
> >               descriptors referring to it have been closed.  The
> >               epoll  interface  automatically  removes  a   file
> >               descriptor  from  an  epoll set only after all the
> >               file descriptors referring to the underlying  open
> >               file  handle  have  been  closed.  This means that
> >               even after a file descriptor that is  part  of  an
> >               epoll  set has been closed, events may be reported
> >               for that file descriptor if other file descriptors
> >               referring  to the same underlying file description
> >               remain open.
> >
> > Does that seem okay?  I plan to include the text in man-pages-2.79.
>
> I agree with Bodo, it is kinda confusing. The name "open file description",
> even though POSIX, looks very similar to "file descriptor".
> I honestly don't know how more easily such concept could be expressed.
> IMHO at least "internal kernel handle" does not play look-alike games with
> "file descriptor".

Okay -- I'll look at it some more.  I am however loath to drop the
term open file description, because POSIX uses it, as do a number of
other Linux man pages by now.

> > Was there some reason why removing a file descriptor couldn't have been
> > made to do the "expected" thing (i.e., remove notifications for that file
> > descriptor, regardless of whether the underlying file description remains
> > open)?
>
> That'd mean placing an eventpoll custom hook into sys_close(). Looks very
> bad to me, and probably will look even worse to other kernel folks.
> Is not much a performance issue (a check to see if a file* is an eventpoll
> file is as easy as comparing the f_op pointer), but a design/style issue.
> On top of that, the interface is already out by many years, so changing it
> will like going to cause problems.

Oh -- I wasn't suggesting we could make the change now -- it would
break the ABI and all that.  I was just wondering why the decision
wasn't made to do it the other way to begin with.  The existing
semantics are somewhat counterintuitive, and potentially interact badly
with libraries that do private manipulations with file descriptors.

Cheers,

Michael
-- 
Michael Kerrisk
Maintainer of the Linux man-pages project
http://www.kernel.org/doc/man-pages/
Want to report a man-pages bug?  Look here:
http://www.kernel.org/doc/man-pages/reporting_bugs.html


* Re: epoll and shared fd's
  2008-02-26 15:13           ` Michael Kerrisk
@ 2008-02-26 19:04             ` Davide Libenzi
  2008-02-26 19:14               ` Michael Kerrisk
  0 siblings, 1 reply; 16+ messages in thread
From: Davide Libenzi @ 2008-02-26 19:04 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Pierre Habouzit, lkml, Eric Dumazet, Marc Lehmann, David Schwartz

On Tue, 26 Feb 2008, Michael Kerrisk wrote:

> Following up after quite some time:
> 
> Davide Libenzi wrote:
> > On Sat, 26 Jan 2008, Michael Kerrisk wrote:
> > 
> >> On Jan 25, 2008 12:57 AM, Davide Libenzi <davidel@xmailserver.org> wrote:
> >>> On Thu, 24 Jan 2008, Pierre Habouzit wrote:
> >>>
> >>>> On Fri, Jan 18, 2008 at 09:10:18PM +0000, Davide Libenzi wrote:
> >>>>> On Fri, 18 Jan 2008, Pierre Habouzit wrote:
> >>>>>
> >>>>>>   Hi,
> >>>>>>
> >>>>>>   I just came across a strange behavior of epoll that seems to
> >>>>>> contradict the documentation. Here is what happens:
> >>>>>>
> >>>>>> * I have two processes P1 and P2, P1 accept()s connections, and send the
> >>>>>>   resulting file descriptors to P2 through a unix socket.
> >>>>>>
> >>>>>> * P2 registers the received socket in his epollfd.
> >>>>>>
> >>>>>>   [time passes]
> >>>>>>
> >>>>>> * P2 is done with the socket and closes it
> >>>>>>
> >>>>>> * P2 gets events for the socket again !
> >>>>>>
> >>>>>>
> >>>>>>   Though the documentation says that if a process closes a file
> >>>>>> descriptor, it gets unregistered. And yes I'm sure that P2 doens't dup()
> >>>>>> the file descriptor. Though (because of a bug) it was still open in
> >>>>>> P1[0], hence the referenced socket still live at the kernel level.
> >>>>>>
> >>>>>>   Of course the userland workaround is to force the EPOLL_CTL_DEL before
> >>>>>> the close, which I now do, but costs me a syscall where I wanted to
> >>>>>> spare one :|
> >>>>> For epoll, a close is when the kernel file* is released (that is, when all
> >>>>> its instances are gone).
> >>>>> We could put a special handling in filp_close(), but I don't think is a
> >>>>> good idea, and we're better live with the current behaviour.
> >>>>   Okay, maybe updating the linux manpages to be more clear about that is
> >>>> the way to go then. Thanks
> >>> Sure. I'll send Michael Kerrisk and updated statement for the A6 answer in
> >>> the epoll man page.
> >> Thanks Davide -- yes please send me a patch.
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> Please read the FAQ at  http://www.tux.org/lkml/
> >>
> > 
> > Something like the one below ...
> > 
> > 
> > - Davide
> > 
> > 
> > 
> > --- epoll.4	2008-01-26 12:58:21.000000000 -0800
> > +++ epoll.4.new	2008-01-26 13:06:36.000000000 -0800
> > @@ -285,7 +285,19 @@
> >  sets automatically?
> >  .TP
> >  .B A6
> > -Yes.
> > +A file descriptor is the userspace counterpart of an internal kernel handle.
> > +Every time a process calls functions liks
> > +.BR dup (2),
> > +.BR dup2 (2)
> > +or
> > +.BR fork (2),
> > +a new file descriptor referring to the same internal kernel handle is
> > +created. The internal kernel handle remains alive until all the userspace
> > +file descriptors have been closed.
> > +The
> > +.BR epoll (4)
> > +interface automatically removes the internal kernel handle from the set,
> > +once all the file descriptor instances have been closed.
> >  .TP
> >  .B Q7
> >  If more than one event occurs between
> 
> Davide,
> 
> Two points.
> 
> a) I did a
> 
> s/internal kernel handle/open file description/
> 
> since that is the POSIX term for the internal handle.
> 
> b) It seems to me that you text doesn't quite make the point explicit
> enough.  I've tried to rewrite it; could you please check:
> 
>        A6     Yes, but be aware of the following point.  A  file
>               descriptor is a reference to an open file descrip-
>               tion (see  open(2)).   Whenever  a  descriptor  is
>               duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
>               or fork(2), a new file descriptor referring to the
>               same  open  file  description is created.  An open
>               file description continues to exist until all file
>               descriptors referring to it have been closed.  The
>               epoll  interface  automatically  removes  a   file
>               descriptor  from  an  epoll set only after all the
>               file descriptors referring to the underlying  open
>               file  handle  have  been  closed.  This means that
>               even after a file descriptor that is  part  of  an
>               epoll  set has been closed, events may be reported
>               for that file descriptor if other file descriptors
>               referring  to the same underlying file description
>               remain open.
> 
> Does that seem okay?  I plan to include the text in man-pages-2.79.

I agree with Bodo, it is kinda confusing. The name "open file description",
even though POSIX, looks very similar to "file descriptor".
I honestly don't know how such a concept could be expressed more simply.
IMHO at least "internal kernel handle" does not play look-alike games with 
"file descriptor".



> Was there some reason why removing a file descriptor couldn't have been
> made to do the "expected" thing (i.e., remove notifications for that file
> descriptor, regardless of whether the underlying file description remains
> open)?

That'd mean placing an eventpoll custom hook into sys_close(). Looks very 
bad to me, and probably will look even worse to other kernel folks.
It is not so much a performance issue (a check to see if a file* is an
eventpoll file is as easy as comparing the f_op pointer), but a design/style
issue. On top of that, the interface has already been out for many years, so
changing it is likely to cause problems.



- Davide




* Re: epoll and shared fd's
       [not found]         ` <Pine.LNX.4.64.0801261308270.10472@alien.or.mcafeemobile.com>
@ 2008-02-26 15:13           ` Michael Kerrisk
  2008-02-26 19:04             ` Davide Libenzi
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Kerrisk @ 2008-02-26 15:13 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Pierre Habouzit, lkml, Eric Dumazet, Marc Lehmann, David Schwartz

Following up after quite some time:

Davide Libenzi wrote:
> On Sat, 26 Jan 2008, Michael Kerrisk wrote:
> 
>> On Jan 25, 2008 12:57 AM, Davide Libenzi <davidel@xmailserver.org> wrote:
>>> On Thu, 24 Jan 2008, Pierre Habouzit wrote:
>>>
>>>> On Fri, Jan 18, 2008 at 09:10:18PM +0000, Davide Libenzi wrote:
>>>>> On Fri, 18 Jan 2008, Pierre Habouzit wrote:
>>>>>
>>>>>>   Hi,
>>>>>>
>>>>>>   I just came across a strange behavior of epoll that seems to
>>>>>> contradict the documentation. Here is what happens:
>>>>>>
>>>>>> * I have two processes P1 and P2, P1 accept()s connections, and send the
>>>>>>   resulting file descriptors to P2 through a unix socket.
>>>>>>
>>>>>> * P2 registers the received socket in his epollfd.
>>>>>>
>>>>>>   [time passes]
>>>>>>
>>>>>> * P2 is done with the socket and closes it
>>>>>>
>>>>>> * P2 gets events for the socket again !
>>>>>>
>>>>>>
>>>>>>   Though the documentation says that if a process closes a file
>>>>>> descriptor, it gets unregistered. And yes I'm sure that P2 doens't dup()
>>>>>> the file descriptor. Though (because of a bug) it was still open in
>>>>>> P1[0], hence the referenced socket still live at the kernel level.
>>>>>>
>>>>>>   Of course the userland workaround is to force the EPOLL_CTL_DEL before
>>>>>> the close, which I now do, but costs me a syscall where I wanted to
>>>>>> spare one :|
>>>>> For epoll, a close is when the kernel file* is released (that is, when all
>>>>> its instances are gone).
>>>>> We could put a special handling in filp_close(), but I don't think is a
>>>>> good idea, and we're better live with the current behaviour.
>>>>   Okay, maybe updating the linux manpages to be more clear about that is
>>>> the way to go then. Thanks
>>> Sure. I'll send Michael Kerrisk and updated statement for the A6 answer in
>>> the epoll man page.
>> Thanks Davide -- yes please send me a patch.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>>
> 
> Something like the one below ...
> 
> 
> - Davide
> 
> 
> 
> --- epoll.4	2008-01-26 12:58:21.000000000 -0800
> +++ epoll.4.new	2008-01-26 13:06:36.000000000 -0800
> @@ -285,7 +285,19 @@
>  sets automatically?
>  .TP
>  .B A6
> -Yes.
> +A file descriptor is the userspace counterpart of an internal kernel handle.
> +Every time a process calls functions liks
> +.BR dup (2),
> +.BR dup2 (2)
> +or
> +.BR fork (2),
> +a new file descriptor referring to the same internal kernel handle is
> +created. The internal kernel handle remains alive until all the userspace
> +file descriptors have been closed.
> +The
> +.BR epoll (4)
> +interface automatically removes the internal kernel handle from the set,
> +once all the file descriptor instances have been closed.
>  .TP
>  .B Q7
>  If more than one event occurs between

Davide,

Two points.

a) I did a

s/internal kernel handle/open file description/

since that is the POSIX term for the internal handle.

b) It seems to me that your text doesn't quite make the point explicit
enough.  I've tried to rewrite it; could you please check:

       A6     Yes, but be aware of the following point.  A  file
              descriptor is a reference to an open file descrip-
              tion (see  open(2)).   Whenever  a  descriptor  is
              duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
              or fork(2), a new file descriptor referring to the
              same  open  file  description is created.  An open
              file description continues to exist until all file
              descriptors referring to it have been closed.  The
              epoll  interface  automatically  removes  a   file
              descriptor  from  an  epoll set only after all the
              file descriptors referring to the underlying  open
              file description have been closed.  This means that
              even after a file descriptor that is  part  of  an
              epoll  set has been closed, events may be reported
              for that file descriptor if other file descriptors
              referring  to the same underlying file description
              remain open.

Does that seem okay?  I plan to include the text in man-pages-2.79.

Was there some reason why removing a file descriptor couldn't have been
made to do the "expected" thing (i.e., remove notifications for that file
descriptor, regardless of whether the underlying file description remains
open)?
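
As an illustration of the A6 text above, here is a minimal Python sketch
(Linux only; it relies on the standard select.epoll wrapper). The helper
name events_after_close and the use of a pipe are invented for the example
and do not come from this thread. It registers the read end of a pipe,
duplicates it with dup(), closes the registered descriptor, and then polls.

```python
import os
import select


def events_after_close(unregister_first):
    """Register a pipe's read end, close it while a dup() stays open, then poll.

    Returns (closed_fd, events); events is the list from epoll.poll().
    Hypothetical helper, for illustration only.
    """
    r, w = os.pipe()
    dup_r = os.dup(r)              # second descriptor, same open file description
    ep = select.epoll()
    try:
        ep.register(r, select.EPOLLIN)
        if unregister_first:
            ep.unregister(r)       # explicit EPOLL_CTL_DEL before close()
        os.close(r)                # open file description survives through dup_r
        os.write(w, b"x")          # make the pipe readable
        return r, ep.poll(0.1)
    finally:
        ep.close()
        os.close(dup_r)
        os.close(w)
```

On a kernel behaving as described in this thread, events_after_close(False)
should still report EPOLLIN for the already-closed descriptor number, while
events_after_close(True) should return an empty event list.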

Cheers,

Michael

-- 
Michael Kerrisk
Maintainer of the Linux man-pages project
http://www.kernel.org/doc/man-pages/
Want to report a man-pages bug?  Look here:
http://www.kernel.org/doc/man-pages/reporting_bugs.html




* Re: epoll and shared fd's
  2008-01-24 23:57     ` Davide Libenzi
@ 2008-01-26  7:37       ` Michael Kerrisk
       [not found]         ` <Pine.LNX.4.64.0801261308270.10472@alien.or.mcafeemobile.com>
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Kerrisk @ 2008-01-26  7:37 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Pierre Habouzit, Linux Kernel Mailing List, mtk/manpages

On Jan 25, 2008 12:57 AM, Davide Libenzi <davidel@xmailserver.org> wrote:
>
> On Thu, 24 Jan 2008, Pierre Habouzit wrote:
>
> > On Fri, Jan 18, 2008 at 09:10:18PM +0000, Davide Libenzi wrote:
> > > On Fri, 18 Jan 2008, Pierre Habouzit wrote:
> > >
> > > >   Hi,
> > > >
> > > >   I just came across a strange behavior of epoll that seems to
> > > > contradict the documentation. Here is what happens:
> > > >
> > > > > * I have two processes, P1 and P2. P1 accept()s connections and sends the
> > > > >   resulting file descriptors to P2 through a unix socket.
> > > > >
> > > > > * P2 registers the received socket in its epollfd.
> > > >
> > > >   [time passes]
> > > >
> > > > * P2 is done with the socket and closes it
> > > >
> > > > * P2 gets events for the socket again !
> > > >
> > > >
> > > > >   Yet the documentation says that if a process closes a file
> > > > > descriptor, it gets unregistered. And yes, I'm sure that P2 doesn't dup()
> > > > > the file descriptor. However (because of a bug), it was still open in
> > > > > P1[0], hence the referenced socket was still live at the kernel level.
> > > >
> > > >   Of course the userland workaround is to force the EPOLL_CTL_DEL before
> > > > the close, which I now do, but costs me a syscall where I wanted to
> > > > spare one :|
> > >
> > > For epoll, a close is when the kernel file* is released (that is, when all
> > > its instances are gone).
> > > > We could put special handling in filp_close(), but I don't think it is a
> > > > good idea, and we're better off living with the current behaviour.
> >
> >   Okay, maybe updating the linux manpages to be more clear about that is
> > the way to go then. Thanks
>
> Sure. I'll send Michael Kerrisk an updated statement for the A6 answer in
> the epoll man page.

Thanks Davide -- yes please send me a patch.


* Re: epoll and shared fd's
  2008-01-24  8:40   ` Pierre Habouzit
@ 2008-01-24 23:57     ` Davide Libenzi
  2008-01-26  7:37       ` Michael Kerrisk
  0 siblings, 1 reply; 16+ messages in thread
From: Davide Libenzi @ 2008-01-24 23:57 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Linux Kernel Mailing List

On Thu, 24 Jan 2008, Pierre Habouzit wrote:

> On Fri, Jan 18, 2008 at 09:10:18PM +0000, Davide Libenzi wrote:
> > On Fri, 18 Jan 2008, Pierre Habouzit wrote:
> > 
> > >   Hi,
> > > 
> > >   I just came across a strange behavior of epoll that seems to
> > > contradict the documentation. Here is what happens:
> > > 
> > > > * I have two processes, P1 and P2. P1 accept()s connections and sends the
> > > >   resulting file descriptors to P2 through a unix socket.
> > > >
> > > > * P2 registers the received socket in its epollfd.
> > > 
> > >   [time passes]
> > > 
> > > * P2 is done with the socket and closes it
> > > 
> > > * P2 gets events for the socket again !
> > > 
> > > 
> > > >   Yet the documentation says that if a process closes a file
> > > > descriptor, it gets unregistered. And yes, I'm sure that P2 doesn't dup()
> > > > the file descriptor. However (because of a bug), it was still open in
> > > > P1[0], hence the referenced socket was still live at the kernel level.
> > > 
> > >   Of course the userland workaround is to force the EPOLL_CTL_DEL before
> > > the close, which I now do, but costs me a syscall where I wanted to
> > > spare one :|
> > 
> > For epoll, a close is when the kernel file* is released (that is, when all 
> > its instances are gone).
> > > We could put special handling in filp_close(), but I don't think it is a
> > > good idea, and we're better off living with the current behaviour.
> 
>   Okay, maybe updating the linux manpages to be more clear about that is
> the way to go then. Thanks

Sure. I'll send Michael Kerrisk an updated statement for the A6 answer in
the epoll man page.



- Davide




* Re: epoll and shared fd's
  2008-01-18 21:10 ` Davide Libenzi
@ 2008-01-24  8:40   ` Pierre Habouzit
  2008-01-24 23:57     ` Davide Libenzi
  0 siblings, 1 reply; 16+ messages in thread
From: Pierre Habouzit @ 2008-01-24  8:40 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1612 bytes --]

On Fri, Jan 18, 2008 at 09:10:18PM +0000, Davide Libenzi wrote:
> On Fri, 18 Jan 2008, Pierre Habouzit wrote:
> 
> >   Hi,
> > 
> >   I just came across a strange behavior of epoll that seems to
> > contradict the documentation. Here is what happens:
> > 
> > > * I have two processes, P1 and P2. P1 accept()s connections and sends the
> > >   resulting file descriptors to P2 through a unix socket.
> > >
> > > * P2 registers the received socket in its epollfd.
> > 
> >   [time passes]
> > 
> > * P2 is done with the socket and closes it
> > 
> > * P2 gets events for the socket again !
> > 
> > 
> > >   Yet the documentation says that if a process closes a file
> > > descriptor, it gets unregistered. And yes, I'm sure that P2 doesn't dup()
> > > the file descriptor. However (because of a bug), it was still open in
> > > P1[0], hence the referenced socket was still live at the kernel level.
> > 
> >   Of course the userland workaround is to force the EPOLL_CTL_DEL before
> > the close, which I now do, but costs me a syscall where I wanted to
> > spare one :|
> 
> For epoll, a close is when the kernel file* is released (that is, when all 
> its instances are gone).
> > We could put special handling in filp_close(), but I don't think it is a
> > good idea, and we're better off living with the current behaviour.

  Okay, maybe updating the linux manpages to be more clear about that is
the way to go then. Thanks

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: epoll and shared fd's
  2008-01-18 13:43 Pierre Habouzit
@ 2008-01-18 21:10 ` Davide Libenzi
  2008-01-24  8:40   ` Pierre Habouzit
  0 siblings, 1 reply; 16+ messages in thread
From: Davide Libenzi @ 2008-01-18 21:10 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: linux-kernel

On Fri, 18 Jan 2008, Pierre Habouzit wrote:

>   Hi,
> 
>   I just came across a strange behavior of epoll that seems to
> contradict the documentation. Here is what happens:
> 
> * I have two processes, P1 and P2. P1 accept()s connections and sends the
>   resulting file descriptors to P2 through a unix socket.
>
> * P2 registers the received socket in its epollfd.
> 
>   [time passes]
> 
> * P2 is done with the socket and closes it
> 
> * P2 gets events for the socket again !
> 
> 
>   Yet the documentation says that if a process closes a file
> descriptor, it gets unregistered. And yes, I'm sure that P2 doesn't dup()
> the file descriptor. However (because of a bug), it was still open in
> P1[0], hence the referenced socket was still live at the kernel level.
> 
>   Of course the userland workaround is to force the EPOLL_CTL_DEL before
> the close, which I now do, but costs me a syscall where I wanted to
> spare one :|

For epoll, a close is when the kernel file* is released (that is, when all 
its instances are gone).
We could put special handling in filp_close(), but I don't think it is a
good idea, and we're better off living with the current behaviour.



- Davide




* epoll and shared fd's
@ 2008-01-18 13:43 Pierre Habouzit
  2008-01-18 21:10 ` Davide Libenzi
  0 siblings, 1 reply; 16+ messages in thread
From: Pierre Habouzit @ 2008-01-18 13:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: madcoder

[-- Attachment #1: Type: text/plain, Size: 1595 bytes --]

  Hi,

  I just came across a strange behavior of epoll that seems to
contradict the documentation. Here is what happens:

* I have two processes, P1 and P2. P1 accept()s connections and sends the
  resulting file descriptors to P2 through a unix socket.

* P2 registers the received socket in its epollfd.

  [time passes]

* P2 is done with the socket and closes it

* P2 gets events for the socket again !


  Yet the documentation says that if a process closes a file
descriptor, it gets unregistered. And yes, I'm sure that P2 doesn't dup()
the file descriptor. However (because of a bug), it was still open in
P1[0], hence the referenced socket was still live at the kernel level.

  Of course the userland workaround is to force the EPOLL_CTL_DEL before
the close, which I now do, but costs me a syscall where I wanted to
spare one :|
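
The scenario above and the EPOLL_CTL_DEL workaround can be reproduced in a
single process. What follows is only a rough Python sketch, not the code
from the software discussed here: it assumes Linux and Python 3.9 or later
(for socket.send_fds and socket.recv_fds), and the names simulate and
close_registered_fd are invented for illustration. One socketpair plays the
P1-to-P2 channel, and a second one stands in for the accept()ed connection.

```python
import os
import select
import socket


def close_registered_fd(ep, fd):
    """Workaround: EPOLL_CTL_DEL first, then close(). Hypothetical helper."""
    ep.unregister(fd)
    os.close(fd)


def simulate(use_workaround):
    """Return (fd_p2, events seen by "P2" after it closed its copy)."""
    chan_p1, chan_p2 = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    conn_p1, client = socket.socketpair()   # stand-in for an accept()ed socket
    ep = select.epoll()
    try:
        # "P1" passes the connection to "P2" via SCM_RIGHTS and, as in the
        # buggy case, keeps its own descriptor open.
        socket.send_fds(chan_p1, [b"fd"], [conn_p1.fileno()])
        _, fds, _, _ = socket.recv_fds(chan_p2, 16, 1)
        fd_p2 = fds[0]

        ep.register(fd_p2, select.EPOLLIN)
        if use_workaround:
            close_registered_fd(ep, fd_p2)
        else:
            os.close(fd_p2)                  # plain close(): entry stays in the set

        client.send(b"late data")            # traffic arrives after P2's close
        return fd_p2, ep.poll(0.1)
    finally:
        ep.close()
        for s in (chan_p1, chan_p2, conn_p1, client):
            s.close()
```

With use_workaround=False the poll is expected to report an event for the
descriptor number P2 already closed; with use_workaround=True it should
report nothing, at the cost of the extra epoll_ctl() call.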

  I _believe_ this is, if not a bug, at least a misfeature, hence I'm
reporting the issue :)


PS: please Cc: me on answers, I'm not subscribed.


  [0] and despite the bug in our software that leaked the socket, P1
      is supposed to only close the socket when P2 acks the fact that it
      received a valid fd (else P1 tries to send it to a P2'), and there
      may be uncontrollable races that could trigger the issue again
      (with P2 closing the socket before P1 had time to process the ACK
      and close the socket on its end).
-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


Thread overview: 16+ messages
     [not found] <9MZLT-1YO-33@gated-at.bofh.it>
     [not found] ` <9N6Ng-5tn-21@gated-at.bofh.it>
     [not found]   ` <9P5WE-33i-11@gated-at.bofh.it>
     [not found]     ` <9Pk9l-1KA-1@gated-at.bofh.it>
     [not found]       ` <9PNNZ-b0-5@gated-at.bofh.it>
     [not found]         ` <a19Lb-1F0-13@gated-at.bofh.it>
     [not found]           ` <a19Lb-1F0-11@gated-at.bofh.it>
2008-02-26 18:16             ` epoll and shared fd's Bodo Eggert
2008-02-28 12:10               ` Michael Kerrisk
2008-02-28 19:17                 ` Bodo Eggert
2008-02-28 19:30                   ` Davide Libenzi
2008-02-28 13:53               ` Valdis.Kletnieks
2008-02-28 15:08                 ` Michael Kerrisk
2008-02-28 19:27                 ` Davide Libenzi
2008-01-18 13:43 Pierre Habouzit
2008-01-18 21:10 ` Davide Libenzi
2008-01-24  8:40   ` Pierre Habouzit
2008-01-24 23:57     ` Davide Libenzi
2008-01-26  7:37       ` Michael Kerrisk
     [not found]         ` <Pine.LNX.4.64.0801261308270.10472@alien.or.mcafeemobile.com>
2008-02-26 15:13           ` Michael Kerrisk
2008-02-26 19:04             ` Davide Libenzi
2008-02-26 19:14               ` Michael Kerrisk
2008-02-26 19:31                 ` Davide Libenzi
