LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Sébastien Dugué" <sebastien.dugue@bull.net>
To: Zach Brown <zach.brown@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-aio@kvack.org,
	drepper@redhat.com, suparna@in.ibm.com, pbadari@us.ibm.com,
	hch@infradead.org, johnpol@2ka.mipt.ru, davem@davemloft.net,
	Andrew Morton <akpm@osdl.org>,
	Jean Pierre Dion <jean-pierre.dion@bull.net>
Subject: Re: [Resend PATCH AIO 1/2] Add AIO completion notification
Date: Thu, 14 Sep 2006 15:49:20 +0200	[thread overview]
Message-ID: <1158241761.3890.42.camel@frecb000686> (raw)
In-Reply-To: <45073A35.10600@oracle.com>

On Tue, 2006-09-12 at 15:52 -0700, Zach Brown wrote:
> >                       AIO completion notification
> 
> Thanks for sending this again.
> 
> >   The current 2.6 kernel does not support notification of user space via
> > an RT signal upon an asynchronous IO completion. The POSIX specification
> > states that when an AIO request completes, a signal can be delivered to
> > the application as notification.
> 
> Does it say anything about the reliability of that signal delivery?  The
> current implementation lets IO submission succeed but signal delivery
> fail and that makes me nervous.  What are the API design goals here?  If
> I missed some previous discussion could we have them summarized in a
> comment above aio_send_signal() ?  More inline..

  I could not find anything concerning signal delivery reliability, but
POSIX AIO being part of the realtime POSIX extension, I'd tend to think
reliability should be taken into acount.

  Aside from that could you elaborate on the signal delivery
failures you have in mind? From what I gathered, the only failure I can
see is if __sigqueue_alloc() fails to allocate from the cache or we
reached the RLIMIT_SIGPENDING limit for the task.

> 
> > +static void aio_send_signal(struct aio_notify *notify)
> > +{
> > +	struct siginfo info;
> 
> That's pretty big to be putting on the stack down under aio_complete()
> which itself is called from block -> fs completion handlers.  It's a
> path with pretty strong stack pressure.  It looks like we only use a
> fraction of the struct, too, so can we rework the signal delivery code a
> little to allow us to specify our inputs more efficiently?  I guess it's
> not a big deal, but it'd be nice to get right if we can.

  Right, I'm open to suggestions here.


> 
> > +	info.si_code = SI_ASYNCIO;
> 
> It looks like USB (usbfs?  urb mumble something?) is already using
> SI_ASYNCIO.  Is that a problem?

  Well SusV3 states: SI_ASYNCIO - Signal generated by completion of an
asynchronous I/O request.

  From what I understand, the usb core code uses SI_ASYNCIO in two
places:
     - in usbfs_remove_device() and it's a mystery to me why it needs to
       set si_code

     - in async_completed() to notify upon completion of an async IO
       operation which is consistent with the specification

  Anyway, I don't see it as a problem, but maybe I'm missing something.



> 
> > +	if (notify->notify & SIGEV_THREAD_ID)
> > +		ret = specific_send_sig_info(notify->signo, &info, p);
> > +	else
> > +		ret = __group_send_sig_info(notify->signo, &info, p);
> > +
> > +
> > +	spin_unlock_irqrestore(&p->sighand->siglock, flags);
> > +
> > +	if (ret)
> > +		printk(KERN_DEBUG "aio_send_signal: failed to send signal %d to %d\n",
> > +		       notify->signo, notify->pid);
> 
> It looks like this is trivial to make fail if one submits more
> operations than the RLIMIT_SIGPENDING rlimit will allow queued signals
> for.  So we definitely don't want a printk that the user can generate.

  Argh, yes.

> 
> This is where I'm nervous about what promises we've made about signal
> delivery back at submit time :).  If the posix API allows lossy delivery
> then we're OK.
> 
> And it can fail if delivery can't allocate a struct sigqueue.  I wonder
> if we should allocate one at submit time and hang it off the iocb to be
> used at completion.  It's just a little struct with a list_head.

  That may be a solution if signal delivery reliability is 
critical, so that we fail at submission time.

> 
> > +static struct task_struct * good_sigevent(sigevent_t * event)
> 
> This is lifted directly from kernel/posix-timers.c, including the lack
> of any indication that it has to be called with the tasklist_lock held
> 'cause it calls find_task_by_pid().  If we're making this a shared
> notion lets put it somewhere that both posix-timers.c and aio.c can get
> at.  kernel/signal.c, presumably?

  Good point, I will do that along with a comment stating the locking 
constraints.

> 
> > +	if (!access_ok(VERIFY_READ, user_event, sizeof(struct sigevent)))
> > +		return -EFAULT;
> 
> You don't need this access_ok(), I don't think.  copy_from_user() should
> return non-zero if something goes wrong.

  OK, but then the comment for __copy_from_user_inatomic() in
asm-*/uaccess.h should be changed ("Caller must check the specified
block with access_ok() before calling this function").

> 
> > +	if (copy_from_user(&event, user_event, sizeof (event)))
> > +		return -EFAULT;
> 
> sigevent is chock-full of members that need compat help when 32bit
> userspace hands it to a 64bit kernel.  See compat_sys_timer_create() ->
> get_compat_sigevent().
> 
> I'm not sure how best to handle this.  It'd be pretty gross to allocate
> compat copies of every sigevent on the stack like compat_sys_io_submit()
> seems to do for the 32bit iocb pointers.  I imagine you'll want to
> conditionally call get_compat_sigevent() for each sigevent somehow.
> 
> Or push the problem to userspace by pointing from the iocb to a special
> case sigevent with fixed width notify, signo, and value.
> 
> None of those options seem thrilling :/.

  Well, I'm not sure either and I'm open for discussion.


> 
> > +static void __aio_write_evt(struct kioctx *ctx, struct io_event *event)
> 
> Making this its own function seems to be unrelated to the completion
> signaling work.  Please drop it from the patch so that we don't have to
> audit it to make sure that code motion didn't mess something up.

  Oops, that slipped past into the patch, will remove.


  Thanks, for your comments.

  Sébastien.

-- 
-----------------------------------------------------

  Sébastien Dugué                BULL/FREC:B1-247
  phone: (+33) 476 29 77 70      Bullcom: 229-7770

  mailto:sebastien.dugue@bull.net

  Linux POSIX AIO: http://www.bullopensource.org/posix
                   http://sourceforge.net/projects/paiol

-----------------------------------------------------


      reply	other threads:[~2006-09-14 13:49 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-09-12  7:44 Sébastien Dugué
2006-09-12 22:52 ` Zach Brown
2006-09-14 13:49   ` Sébastien Dugué [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1158241761.3890.42.camel@frecb000686 \
    --to=sebastien.dugue@bull.net \
    --cc=akpm@osdl.org \
    --cc=davem@davemloft.net \
    --cc=drepper@redhat.com \
    --cc=hch@infradead.org \
    --cc=jean-pierre.dion@bull.net \
    --cc=johnpol@2ka.mipt.ru \
    --cc=linux-aio@kvack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbadari@us.ibm.com \
    --cc=suparna@in.ibm.com \
    --cc=zach.brown@oracle.com \
    --subject='Re: [Resend PATCH AIO 1/2] Add AIO completion notification' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).