LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
@ 2007-08-17 15:41 John Blackwood
  2007-08-17 20:55 ` Neil Horman
  0 siblings, 1 reply; 10+ messages in thread
From: John Blackwood @ 2007-08-17 15:41 UTC (permalink / raw)
  To: Neil Horman
  Cc: linux-rt-users, linux-kernel, Sven-Thorsten Dietrich,
	Gregory Haskins, Tom Horsley

Hi Neil,

We've been having problems with this select patch change.

Specifically -- previously, when a ptrace attach was done to a task
blocked in a select() call and that task had a timeout value,
the task would restart the select() call with an updated timeout value.

With this patch in place, the task now instead returns EINTR.

A test that shows this issue is provided below.

We also confirmed that attaching to a program sitting
in select() with gdb makes the select get an EINTR, so this
behavior also shows up in gdb.

Thank you for your considerations in this matter.

------------------- -------------------
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <time.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/ptrace.h>

int
main(int argc, char ** argv) {
    pid_t kid;
    if ((kid = fork()) == 0) {
       int ms_wait = 2000;
       int rval;
       struct timeval timeout;

       timeout.tv_sec = ms_wait / 1000;
       timeout.tv_usec = (ms_wait % 1000) * 1000;
       rval = select(0, NULL, NULL, NULL, &timeout);
       if (rval == -1) {
          int errcode = errno;
          printf("Hey! Why did my select error off with errno %d (%s)?\n",
                 errcode, strerror(errcode));
          fflush(stdout);
       } else {
          printf("select call completed, return value: %d\n", rval);
       }
       exit(0);
    } else if (kid == (pid_t)-1) {
       perror("fork");
    } else {
       int ms_wait = 500;
       int rval;
       struct timeval timeout;

       /* Wait a bit to make sure kid has a chance to get into its
        * select call
        */
       timeout.tv_sec = ms_wait / 1000;
       timeout.tv_usec = (ms_wait % 1000) * 1000;
       rval = select(0, NULL, NULL, NULL, &timeout);

       /* Now attach to the kid, then continue him.
        */
       if (ptrace(PTRACE_ATTACH, kid, (void *)0, (void *)0) != 0) {
          perror("ptrace");
       }
       if (waitpid(kid, &rval, 0) != kid) {
          perror("waitpid");
       }
       if (ptrace(PTRACE_CONT, kid, (void *)0, (void *)0) != 0) {
          perror("ptrace");
       }
       if (waitpid(kid, &rval, 0) != kid) {
          perror("waitpid");
       }
    }
    return 0;
}

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-08-17 15:41 [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace John Blackwood
@ 2007-08-17 20:55 ` Neil Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2007-08-17 20:55 UTC (permalink / raw)
  To: John Blackwood
  Cc: linux-rt-users, linux-kernel, Sven-Thorsten Dietrich,
	Gregory Haskins, Tom Horsley

On Fri, Aug 17, 2007 at 11:41:20AM -0400, John Blackwood wrote:
> Hi Neil,
> 
> We've been having problems with this select patch change.
> 
> Specifically -- previously, when a ptrace attach was done to a task
> blocked in a select() call and that task had a timeout value,
> the task would restart the select() call with an updated timeout value.
> 
> With this patch in place, the task now instead returns EINTR.
> 
> A test that shows this issue is provided below.
> 
> We also confirmed that attaching to a program sitting
> in select() with gdb makes the select get an EINTR, so this
> behavior also shows up in gdb.
> 
> Thank you for your considerations in this matter.
> 
I'm on vacation at the moment, but will look at this more closely when I get
back on tuesday
Regards
Neil

> ------------------- -------------------
> #include <stdio.h>
> #include <sys/time.h>
> #include <sys/types.h>
> #include <unistd.h>
> #include <errno.h>
> #include <stdlib.h>
> #include <string.h>
> #include <ctype.h>
> #include <time.h>
> #include <signal.h>
> #include <sys/wait.h>
> #include <sys/ptrace.h>
> 
> int
> main(int argc, char ** argv) {
>    pid_t kid;
>    if ((kid = fork()) == 0) {
>       int ms_wait = 2000;
>       int rval;
>       struct timeval timeout;
> 
>       timeout.tv_sec = ms_wait / 1000;
>       timeout.tv_usec = (ms_wait % 1000) * 1000;
>       rval = select(0, NULL, NULL, NULL, &timeout);
>       if (rval == -1) {
>          int errcode = errno;
>          printf("Hey! Why did my select error off with errno %d (%s)?\n",
>                 errcode, strerror(errcode));
>          fflush(stdout);
>       } else {
>          printf("select call completed, return value: %d\n", rval);
>       }
>       exit(0);
>    } else if (kid == (pid_t)-1) {
>       perror("fork");
>    } else {
>       int ms_wait = 500;
>       int rval;
>       struct timeval timeout;
> 
>       /* Wait a bit to make sure kid has a chance to get into its
>        * select call
>        */
>       timeout.tv_sec = ms_wait / 1000;
>       timeout.tv_usec = (ms_wait % 1000) * 1000;
>       rval = select(0, NULL, NULL, NULL, &timeout);
> 
>       /* Now attach to the kid, then continue him.
>        */
>       if (ptrace(PTRACE_ATTACH, kid, (void *)0, (void *)0) != 0) {
>          perror("ptrace");
>       }
>       if (waitpid(kid, &rval, 0) != kid) {
>          perror("waitpid");
>       }
>       if (ptrace(PTRACE_CONT, kid, (void *)0, (void *)0) != 0) {
>          perror("ptrace");
>       }
>       if (waitpid(kid, &rval, 0) != kid) {
>          perror("waitpid");
>       }
>    }
>    return 0;
> }

-- 
/***************************************************
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 *nhorman@tuxdriver.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-01-24  5:59 ` David Miller
@ 2007-01-24 13:21   ` Neil Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2007-01-24 13:21 UTC (permalink / raw)
  To: David Miller; +Cc: linux-kernel, akpm, torvalds

On Tue, Jan 23, 2007 at 09:59:26PM -0800, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Tue, 16 Jan 2007 15:13:32 -0500
> 
> > As it is currently written, sys_select checks its return code to convert
> > ERESTARTNOHAND to EINTR.  However, the check is within an if (tvp) clause, and
> > so if select is called from userspace with a NULL timeval, then it is possible
> > for the ERESTARTNOHAND errno to leak into userspace, which is incorrect.  This
> > patch moves that check outside of the conditional, and prevents the errno leak.
> > 
> > Signed-Off-By: Neil Horman <nhorman@tuxdriver.com>
> 
> As has been probably mentioned, this change is bogus.
> 
> core_sys_select() returns -ERESTARTNOHAND in exactly the
> case where that return value is legal, when a signal is
> pending, so that the signal dispatch code (on return from
> the system call) will "fixup" the -ERESTARTNOHAND return
> value so that userspace does not see it.
> 
> What could be happening is that, somehow, we see the signal
> pending in core_sys_select(), but for some reason by the time
> the signal dispatch code would run, the signal is cleared.
> 
> I always found this scheme a little fishy, relying on signal
> pending state to guarentee the fixup of the syscall restart
> error return values.
> 
> For example, I see nothing that prevents another thread sharing
> this signal state from clearing the signal between the syscall
> checking in the other thread and the signal dispatch check running
> in that other thread.
> 
> 	cpu 1			cpu 2
> 	check sigpending
> 				clear signal
> 	syscall return
> 	no signal panding
> 	get -ERESTARTNOHAND
> 
> Perhaps something makes this no happen, but I don't see it :)
This is exactly the theory that I've been tossing around with the person that
reported it to me.  We're still witing on a reproduer, but I'm currently working
on a multithreaded test case to try and trigger user space leaks of
ERESTARTNOHAND to user space.

Neil


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-01-16 20:13 Neil Horman
  2007-01-22 13:59 ` Paolo Ornati
@ 2007-01-24  5:59 ` David Miller
  2007-01-24 13:21   ` Neil Horman
  1 sibling, 1 reply; 10+ messages in thread
From: David Miller @ 2007-01-24  5:59 UTC (permalink / raw)
  To: nhorman; +Cc: linux-kernel, akpm, torvalds

From: Neil Horman <nhorman@tuxdriver.com>
Date: Tue, 16 Jan 2007 15:13:32 -0500

> As it is currently written, sys_select checks its return code to convert
> ERESTARTNOHAND to EINTR.  However, the check is within an if (tvp) clause, and
> so if select is called from userspace with a NULL timeval, then it is possible
> for the ERESTARTNOHAND errno to leak into userspace, which is incorrect.  This
> patch moves that check outside of the conditional, and prevents the errno leak.
> 
> Signed-Off-By: Neil Horman <nhorman@tuxdriver.com>

As has been probably mentioned, this change is bogus.

core_sys_select() returns -ERESTARTNOHAND in exactly the
case where that return value is legal, when a signal is
pending, so that the signal dispatch code (on return from
the system call) will "fixup" the -ERESTARTNOHAND return
value so that userspace does not see it.

What could be happening is that, somehow, we see the signal
pending in core_sys_select(), but for some reason by the time
the signal dispatch code would run, the signal is cleared.

I always found this scheme a little fishy, relying on signal
pending state to guarentee the fixup of the syscall restart
error return values.

For example, I see nothing that prevents another thread sharing
this signal state from clearing the signal between the syscall
checking in the other thread and the signal dispatch check running
in that other thread.

	cpu 1			cpu 2
	check sigpending
				clear signal
	syscall return
	no signal panding
	get -ERESTARTNOHAND

Perhaps something makes this no happen, but I don't see it :)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-01-22 16:24       ` Neil Horman
@ 2007-01-23  0:00         ` bert hubert
  0 siblings, 0 replies; 10+ messages in thread
From: bert hubert @ 2007-01-23  0:00 UTC (permalink / raw)
  To: Neil Horman; +Cc: Linus Torvalds, Paolo Ornati, linux-kernel, akpm, torvalds

On Mon, Jan 22, 2007 at 11:24:06AM -0500, Neil Horman wrote:

> The error was reported to me second hand.  I'm expecting a reproducer (although
> to date, I'm still waiting for it, so I may have jumped the gun here).  In fact,

I've asked for a repoducer weeks ago and nothing happened, nobody even
bothered to reply - I wrote a small test-app that might exhibit the problem,
but didn't manage to trigger it.

	Bert
	
-- 
http://www.PowerDNS.com      Open source, database driven DNS Software 
http://netherlabs.nl              Open and Closed source services

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-01-22 16:03     ` Linus Torvalds
@ 2007-01-22 16:24       ` Neil Horman
  2007-01-23  0:00         ` bert hubert
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Horman @ 2007-01-22 16:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Paolo Ornati, linux-kernel, akpm, torvalds

On Mon, Jan 22, 2007 at 08:03:53AM -0800, Linus Torvalds wrote:
> 
> 
> On Mon, 22 Jan 2007, Neil Horman wrote:
> 
> > On Mon, Jan 22, 2007 at 02:59:56PM +0100, Paolo Ornati wrote:
> > > 
> > > the ERESTARTNOHAND thing is handled in arch specific signal code,
> > 
> > In the signal handling path yes.
> 
> Right.
> 
> > Not always in the case of select, though.  Check core_sys_select:
> 
> No, even in the case of select().
> 
> > if (!ret) {
> >                 ret = -ERESTARTNOHAND;
> >                 if (signal_pending(current))
> >                         goto out;
> >                 ret = 0;
> 
> Since we have "signal_pending(current)" being true, we _know_ that the 
> signal handling path will be triggered, so the ERESTARTNOHAND will be 
> changed into the appropriate error return (or restart) by the signal 
> handling code.
> 
> > Its possible for core_sys_select to return ERESTARTNOHAND to sys_select, which
> > will in turn (as its currently written), return that value back to user space.
> 
> No. Exactly because sys_select() will always return through the system 
> call handling path, and that will turn the ERESTARTNOHAND into something 
> else.
> 
> NOTE! If you use "ptrace()", you may see the internal errors. But that's a 
> ptrace-only thing, and may have fooled you into thinking that the actual 
> _application_ sees those internal errors. It won't.
> 
> Of course, we could have some signal-handling bug here, but if so, it 
> would affect a lot more than just select(). Have you actually seen 
> ERESTARTNOINTR in the app (not just ptrace?)
> 
The error was reported to me second hand.  I'm expecting a reproducer (although
to date, I'm still waiting for it, so I may have jumped the gun here).  In fact,
I see what your saying now, down in the assembly glue for our arches (x86 in
this case) we jump to do_notify_resume since we have a pending signal, and
inside do_signal from there we fix up ERESTARTNOHAND to be something sane for
userspace.  Ok, I withdraw this patch.  I'll repost when/if I get my hands on
the reproducer and see that something is actually slipping through.

Neil

> 		Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-01-22 14:52   ` Neil Horman
@ 2007-01-22 16:03     ` Linus Torvalds
  2007-01-22 16:24       ` Neil Horman
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2007-01-22 16:03 UTC (permalink / raw)
  To: Neil Horman; +Cc: Paolo Ornati, linux-kernel, akpm, torvalds



On Mon, 22 Jan 2007, Neil Horman wrote:

> On Mon, Jan 22, 2007 at 02:59:56PM +0100, Paolo Ornati wrote:
> > 
> > the ERESTARTNOHAND thing is handled in arch specific signal code,
> 
> In the signal handling path yes.

Right.

> Not always in the case of select, though.  Check core_sys_select:

No, even in the case of select().

> if (!ret) {
>                 ret = -ERESTARTNOHAND;
>                 if (signal_pending(current))
>                         goto out;
>                 ret = 0;

Since we have "signal_pending(current)" being true, we _know_ that the 
signal handling path will be triggered, so the ERESTARTNOHAND will be 
changed into the appropriate error return (or restart) by the signal 
handling code.

> Its possible for core_sys_select to return ERESTARTNOHAND to sys_select, which
> will in turn (as its currently written), return that value back to user space.

No. Exactly because sys_select() will always return through the system 
call handling path, and that will turn the ERESTARTNOHAND into something 
else.

NOTE! If you use "ptrace()", you may see the internal errors. But that's a 
ptrace-only thing, and may have fooled you into thinking that the actual 
_application_ sees those internal errors. It won't.

Of course, we could have some signal-handling bug here, but if so, it 
would affect a lot more than just select(). Have you actually seen 
ERESTARTNOINTR in the app (not just ptrace?)

		Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-01-22 13:59 ` Paolo Ornati
@ 2007-01-22 14:52   ` Neil Horman
  2007-01-22 16:03     ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Horman @ 2007-01-22 14:52 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: linux-kernel, akpm, torvalds

On Mon, Jan 22, 2007 at 02:59:56PM +0100, Paolo Ornati wrote:
> On Tue, 16 Jan 2007 15:13:32 -0500
> Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> > As it is currently written, sys_select checks its return code to convert
> > ERESTARTNOHAND to EINTR.  However, the check is within an if (tvp) clause, and
> > so if select is called from userspace with a NULL timeval, then it is possible
> > for the ERESTARTNOHAND errno to leak into userspace, which is incorrect.  This
> > patch moves that check outside of the conditional, and prevents the errno leak.
> 
> the ERESTARTNOHAND thing is handled in arch specific signal code,

In the signal handling path yes.
Not always in the case of select, though.  Check core_sys_select:

if (!ret) {
                ret = -ERESTARTNOHAND;
                if (signal_pending(current))
                        goto out;
                ret = 0;
        }
...

out:
        if (bits != stack_fds)
                kfree(bits);
out_nofds:
        return ret;

Its possible for core_sys_select to return ERESTARTNOHAND to sys_select, which
will in turn (as its currently written), return that value back to user space.

> syscalls can return -ERESTARTNOHAND as much as they want (and your
> change breaks the current behaviour of select()).
> 

It doesn't break it, it fixes it.  select isn't meant to ever return
ERESTARTNOHAND to user space:
http://www.opengroup.org/onlinepubs/009695399/functions/select.html

And ENORESTARTHAND isn't defined in the userspace errno.h, so this causes all
sorts of confusion for apps that don't expect it.

Neil

> For example:
> 
> arch/x86_64/kernel/signal.c
> 
>         /* Are we from a system call? */
>         if ((long)regs->orig_rax >= 0) {
>                 /* If so, check system call restarting.. */
>                 switch (regs->rax) {
>                         case -ERESTART_RESTARTBLOCK:
>                         case -ERESTARTNOHAND:
>                                 regs->rax = -EINTR;
>                                 break;
> 
> -- 
> 	Paolo Ornati
> 	Linux 2.6.20-rc5 on x86_64

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
  2007-01-16 20:13 Neil Horman
@ 2007-01-22 13:59 ` Paolo Ornati
  2007-01-22 14:52   ` Neil Horman
  2007-01-24  5:59 ` David Miller
  1 sibling, 1 reply; 10+ messages in thread
From: Paolo Ornati @ 2007-01-22 13:59 UTC (permalink / raw)
  To: Neil Horman; +Cc: linux-kernel, akpm, torvalds, nhorman

On Tue, 16 Jan 2007 15:13:32 -0500
Neil Horman <nhorman@tuxdriver.com> wrote:

> As it is currently written, sys_select checks its return code to convert
> ERESTARTNOHAND to EINTR.  However, the check is within an if (tvp) clause, and
> so if select is called from userspace with a NULL timeval, then it is possible
> for the ERESTARTNOHAND errno to leak into userspace, which is incorrect.  This
> patch moves that check outside of the conditional, and prevents the errno leak.

the ERESTARTNOHAND thing is handled in arch specific signal code,
syscalls can return -ERESTARTNOHAND as much as they want (and your
change breaks the current behaviour of select()).

For example:

arch/x86_64/kernel/signal.c

        /* Are we from a system call? */
        if ((long)regs->orig_rax >= 0) {
                /* If so, check system call restarting.. */
                switch (regs->rax) {
                        case -ERESTART_RESTARTBLOCK:
                        case -ERESTARTNOHAND:
                                regs->rax = -EINTR;
                                break;

-- 
	Paolo Ornati
	Linux 2.6.20-rc5 on x86_64

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace
@ 2007-01-16 20:13 Neil Horman
  2007-01-22 13:59 ` Paolo Ornati
  2007-01-24  5:59 ` David Miller
  0 siblings, 2 replies; 10+ messages in thread
From: Neil Horman @ 2007-01-16 20:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, torvalds, nhorman

As it is currently written, sys_select checks its return code to convert
ERESTARTNOHAND to EINTR.  However, the check is within an if (tvp) clause, and
so if select is called from userspace with a NULL timeval, then it is possible
for the ERESTARTNOHAND errno to leak into userspace, which is incorrect.  This
patch moves that check outside of the conditional, and prevents the errno leak.

Thanks & Regards
Neil

Signed-Off-By: Neil Horman <nhorman@tuxdriver.com>


 select.c |   18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)


diff --git a/fs/select.c b/fs/select.c
index fe0893a..afcd691 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -415,20 +415,12 @@ asmlinkage long sys_select(int n, fd_set __user *inp, fd_set __user *outp,
 		rtv.tv_sec = timeout;
 		if (timeval_compare(&rtv, &tv) >= 0)
 			rtv = tv;
-		if (copy_to_user(tvp, &rtv, sizeof(rtv))) {
-sticky:
-			/*
-			 * If an application puts its timeval in read-only
-			 * memory, we don't want the Linux-specific update to
-			 * the timeval to cause a fault after the select has
-			 * completed successfully. However, because we're not
-			 * updating the timeval, we can't restart the system
-			 * call.
-			 */
-			if (ret == -ERESTARTNOHAND)
-				ret = -EINTR;
-		}
+		if (copy_to_user(tvp, &rtv, sizeof(rtv)))
+			return -EFAULT;
 	}
+sticky:
+	if (ret == -ERESTARTNOHAND)
+		ret = -EINTR;
 
 	return ret;
 }

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-08-17 20:56 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-17 15:41 [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace John Blackwood
2007-08-17 20:55 ` Neil Horman
  -- strict thread matches above, loose matches on Subject: below --
2007-01-16 20:13 Neil Horman
2007-01-22 13:59 ` Paolo Ornati
2007-01-22 14:52   ` Neil Horman
2007-01-22 16:03     ` Linus Torvalds
2007-01-22 16:24       ` Neil Horman
2007-01-23  0:00         ` bert hubert
2007-01-24  5:59 ` David Miller
2007-01-24 13:21   ` Neil Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).