From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Johann Borck <johann.borck@densedata.com>,
	Ulrich Drepper <drepper@redhat.com>,
	Ulrich Drepper <drepper@gmail.com>,
	lkml <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>, Andrew Morton <akpm@osdl.org>,
	netdev <netdev@vger.kernel.org>,
	Zach Brown <zach.brown@oracle.com>,
	Christoph Hellwig <hch@infradead.org>,
	Chase Venters <chase.venters@clientec.com>
Subject: Re: [take19 1/4] kevent: Core files.
Date: Tue, 17 Oct 2006 19:09:30 +0400
Message-ID: <20061017150929.GA2288@2ka.mipt.ru>
In-Reply-To: <200610171625.00515.dada1@cosmosbay.com>

On Tue, Oct 17, 2006 at 04:25:00PM +0200, Eric Dumazet (dada1@cosmosbay.com) wrote:
> On Tuesday 17 October 2006 16:07, Evgeniy Polyakov wrote:
> > On Tue, Oct 17, 2006 at 03:52:34PM +0200, Eric Dumazet (dada1@cosmosbay.com) wrote:
> > > > What about the case, which I described in other e-mail, when in case of
> > > > the full ring buffer, no new events are written there, and when
> > > > userspace commits (i.e. marks as ready to be freed or requeued by
> > > > kernel) some events, new ones will be copied from ready queue into the
> > > > buffer?
> > >
> > > Then, user might receive 'false events', exactly like
> > > poll()/select()/epoll() can do sometime. IE a 'ready' indication while
> > > there is no current event available on a particular fd / event_source.
> >
> > Only if the user simultaneously uses both interfaces and removes an event
> > from the queue while its copy is in the mapped buffer, but in that case it's
> > the user's problem (and if we do want, we can store a pointer/index of the
> > ring buffer entry, so when an event is removed from the ready queue (using
> > kevent_get_events()), the appropriate entry in the ring buffer will be
> > updated to show that it is no longer valid).
> >
> > > This should be safe, since those programs already ignore read()
> > > returns -EAGAIN and other similar things.
> > >
> > > Programmer prefers to receive two 'event available' indications than ZERO
> > > (and be stuck for infinite time). Of course, hot path (normal cases)
> > > should return one 'event' only.
> > >
> > > In other words, being ultra fast 99.99 % of the time, but being able to
> > > block forever once in a while is not an option.
> >
> > Have I missed something? It looks like the only problematic situation is
> > described above when user simultaneously uses both interfaces.
> 
> In my point of view, user of the 'mmaped ring buffer' should be prepared to 
> use both interfaces. Or else you are forced to presize the ring buffer to 
> insane limits.
> 
> That is :
> - Most of the time, we expect consuming events via mmaped ring buffer and no 
> syscalls.
> - In case we notice a 'mmaped ring buffer overflow', syscalls to get/consume 
> events that could not be stored in mmaped buffer (but queued by kevent 
> subsystem). If not stored by kevent subsystem (memory failure ?), revert to 
> poll() to fetch all 'missed fds' in one row. Go back to normal mode.

kevent uses less memory per event than epoll(), so it is very unlikely
that kevent will fail to store a new event in a case where epoll() would
succeed. The same applies to poll(), which allocates its whole table
inside the syscall.

> - In case of empty ring buffer (or no mmap support at all, because this app 
> doesn't expect a lot of events per time unit, or because kevent doesn't have mmap 
> support) : Be able to syscall and wait for an event.

So the most complex case is when the user is going to use both
interfaces, and what the user does when the mapped ring buffer has
overflowed.
In that case the user can read some events from the ring buffer and mark
them as ready (the latter is done through a special syscall), so that the
kevent core puts new ready events there.
The user can also get events using the usual syscall; in that case the
events in the ring buffer must be updated. I actually implemented the
mapped buffer in a way that allows events to be removed from the queue:
the queue is a FIFO, and the first entry obtained through the syscall is
_always_ the first entry in the ring buffer.

So when the user reads an event through the syscall (no matter whether we
are in the overflow case or not), the event being read is easily
accessible in the ring buffer.

So I propose the following (quite simple) design for the ring buffer:
kernelspace maintains two indexes, to the first and the last events in
the ring buffer (and the maximum size of the buffer, of course).
When a new event is marked as ready, some info is copied into the ring
buffer and the index of the last entry is increased.
When an event is read through the syscall, it is _guaranteed_ that the
event will be at the position pointed to by the index of the first
element; that index is then increased (thus opening a new slot in the
buffer).
If the index of the last entry reaches (with possible wrapping) the index
of the first entry, an overflow has happened. In this case no new
events can be copied into the ring buffer, so they are only placed into
the ready queue (accessible through the kevent_get_events() syscall).
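
The producer side of this design can be sketched as a small userspace
model. It is only an illustration, not kevent code: the names
ring_model and ring_mark_ready(), the sizes, and the use of plain ints
as events are all hypothetical. The model also keeps an explicit entry
count to tell a full ring from an empty one, which is an assumption;
the description above only mentions the two indexes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4   /* hypothetical; a real buffer size would be configurable */
#define QUEUE_MAX 16  /* hypothetical bound on the modelled ready queue */

/* Userspace model only, not kernel code: an int id stands in for an event. */
struct ring_model {
    int ring[RING_SIZE];   /* copies visible through the mapped buffer */
    size_t first;          /* index of the oldest entry in the ring */
    size_t last;           /* index where the next entry will be written */
    size_t in_ring;        /* assumption: a count tells full from empty */
    int ready[QUEUE_MAX];  /* FIFO ready queue; the ring mirrors its head */
    size_t ready_len;
};

/*
 * Mark an event as ready: it always goes into the ready queue and is
 * copied into the ring only if there is a free slot.
 * Returns true if the event was also copied into the ring.
 */
static bool ring_mark_ready(struct ring_model *m, int id)
{
    assert(m->ready_len < QUEUE_MAX);
    m->ready[m->ready_len++] = id;

    if (m->in_ring == RING_SIZE)
        return false;  /* overflow: the event stays in the ready queue only */

    m->ring[m->last] = id;
    m->last = (m->last + 1) % RING_SIZE;  /* wrap the last index */
    m->in_ring++;
    return true;
}
```

With RING_SIZE set to 4, the fifth event marked ready before anything
is consumed is kept in the ready queue only; at that point the last
index has wrapped around to the first index, which is the overflow
condition described above.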

When the user calls kevent_get_events(), it obtains the first element
(pointed to by the index of the first element in the ring buffer), and if
there is a ready event which has not been placed into the ring buffer, it
is copied there (with the appropriate update of the last index and a new
overflow condition).

When userspace calls kevent_wait(num), it marks the first $num elements
(counting from the index of the first element) as ready, so they can be
removed (or requeued) and replaced by pending ready events.
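
The two consumer paths can be sketched in the same kind of userspace
model. Again this is an illustration under assumptions, not the kevent
implementation: model_get_event() stands in for kevent_get_events()
returning a single event, model_commit() stands in for kevent_wait(num),
committed events are simply removed (requeueing is not modelled), and
all names and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 2   /* hypothetical, kept small to force an overflow */
#define QUEUE_MAX 16  /* hypothetical bound on the modelled ready queue */

struct ring_model {
    int ring[RING_SIZE];   /* copies visible through the mapped buffer */
    size_t first, last;    /* indexes of the first entry and the next slot */
    size_t in_ring;        /* assumption: a count tells full from empty */
    int ready[QUEUE_MAX];  /* FIFO ready queue; the ring mirrors its head */
    size_t ready_len;
};

/* Copy pending ready events into free ring slots, oldest first. */
static void ring_refill(struct ring_model *m)
{
    while (m->in_ring < RING_SIZE && m->in_ring < m->ready_len) {
        m->ring[m->last] = m->ready[m->in_ring];
        m->last = (m->last + 1) % RING_SIZE;
        m->in_ring++;
    }
}

/* Mark an event as ready; it reaches the ring only if a slot is free. */
static void model_mark_ready(struct ring_model *m, int id)
{
    assert(m->ready_len < QUEUE_MAX);
    m->ready[m->ready_len++] = id;
    ring_refill(m);
}

/* Drop the oldest event from both the ready queue and the ring. */
static int model_pop_first(struct ring_model *m)
{
    int id = m->ready[0];

    /* FIFO guarantee: the queue head is always the first ring entry. */
    assert(m->in_ring > 0 && m->ring[m->first] == id);
    m->ready_len--;
    memmove(m->ready, m->ready + 1, m->ready_len * sizeof(m->ready[0]));
    m->first = (m->first + 1) % RING_SIZE;
    m->in_ring--;
    return id;
}

/* Models kevent_get_events() for one event; returns -1 if none is ready. */
static int model_get_event(struct ring_model *m)
{
    if (m->ready_len == 0)
        return -1;
    int id = model_pop_first(m);
    ring_refill(m);  /* a pending event takes the freed slot */
    return id;
}

/* Models kevent_wait(num): commit up to num entries from the first index. */
static size_t model_commit(struct ring_model *m, size_t num)
{
    size_t done = 0;

    while (done < num && m->in_ring > 0) {
        model_pop_first(m);
        done++;
    }
    ring_refill(m);  /* replace committed entries with pending events */
    return done;
}
```

Both paths advance the same first index, so in this model the syscall
and the mapped buffer never disagree about which event is the oldest,
and pending events enter the ring in FIFO order as soon as slots are
freed.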

Does it sound like clawing over the glass or much better?

> Eric

-- 
	Evgeniy Polyakov
