LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Question about clearing of tsk->robust_list in clone
@ 2011-02-15  6:43 Kenneth Albanowski (Palm GBU)
  2011-02-15 13:16 ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Kenneth Albanowski (Palm GBU) @ 2011-02-15  6:43 UTC (permalink / raw)
  To: linux-kernel

I've been tracking down a bug I ran into with a robust pthreads mutex
shared between a parent and a child that it forked. This finally came
down to a bad interaction between glibc and the kernel (and looks to
be present in the current Linux trees as well as glibc 2.13): 
copy_process() explicitly clears the robust_list pointer in the cloned
child. However, there is currently no logic in nptl to re-establish
the robust list pointer after a fork.

There was some conversation about this in the Fedora bugtracker:

  https://bugzilla.redhat.com/show_bug.cgi?id=628608

Those folks appear to have reached the same conclusion as I: this
could either be solved with some potentially complex glibc code, or by
simply not having the kernel NULL out robust_list in the child.

That code came in at:

    commit 8f17d3a5049d32392b79925c73a0cf99ce6d5af0
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Mon Mar 27 01:16:27 2006 -0800

        [PATCH] lightweight robust futexes updates
    
        - fix: initialize the robust list(s) to NULL in copy_process.
    
        - doc update
    
        - cleanup: rename _inuser to _inatomic
    
        - __user cleanups and other small cleanups
    
Can anyone say what problem was being fixed by initializing the robust
list(s) to NULL? I've stared at the implementation, and I cannot see any
harm (potentially a slight bit more work in exec, but no harm) in not
clearing them.

Test program appended, to demonstrate the issue somewhat concisely.

- Kenneth

-- child_robust_mutex_death.c --
#define _GNU_SOURCE // for ROBUST mutexes from pthread.h
#include <sys/mman.h>
#include <sys/wait.h>
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>

int main()
{
  // Use an anonymous shared mapping to put the same mutex in child as in parent. Any other
  // sharing technique would work, this is simply easy and doesn't clutter.
  void * region = mmap(NULL, 4096, PROT_READ|PROT_WRITE,MAP_SHARED|MAP_ANONYMOUS, -1, 0);
  pthread_mutex_t * mutex = (pthread_mutex_t*)region;

  pthread_mutexattr_t attr;

  pthread_mutexattr_init(&attr);
  pthread_mutexattr_setrobust_np(&attr, PTHREAD_MUTEX_ROBUST_NP);
  pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
  pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  pthread_mutex_init(mutex, &attr); 

  if (fork() == 0) {
    // child
    // To demonstrate issue, this lock must occur directly in a fork child which has not exec'd, and
    // not in a thread created via pthread_create after forking.
    pthread_mutex_lock(mutex);

    // We now exit from the child, which should trigger the robust mechanism to clean up the mutex.
    _exit(0);
    
  } else {
    // parent
    wait(NULL); // wait for child to die.
    
    // try to obtain the lock, indicating whether the child still has it. This test could be
    // done from any process with access to the mutex, it is merely convenient to do it from
    // the parent.
      
    int err = pthread_mutex_trylock(mutex);
    if (err == 0) {
      printf("Failed, mutex unlocked, but no EOWNERDEAD -- I don't know what happened.\n");
    } else if (err == EBUSY) {
      printf("Failed, mutex not unlocked -- fork child tsk->robust_list == NULL issue.\n");
    } else if (err == EOWNERDEAD) {
      printf("Success, robust mutex reported owner death as intended.\n");
    }
    _exit(0);
  }
}


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about clearing of tsk->robust_list in clone
  2011-02-15  6:43 Question about clearing of tsk->robust_list in clone Kenneth Albanowski (Palm GBU)
@ 2011-02-15 13:16 ` Thomas Gleixner
  2011-02-15 14:00   ` Peter Zijlstra
  2011-02-15 22:51   ` Kenneth Albanowski (Palm GBU)
  0 siblings, 2 replies; 6+ messages in thread
From: Thomas Gleixner @ 2011-02-15 13:16 UTC (permalink / raw)
  To: Kenneth Albanowski (Palm GBU)
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Jakub Jelinek, Andreas Schwab

On Mon, 14 Feb 2011, Kenneth Albanowski (Palm GBU) wrote:

Cc'ed a few folks.

> I've been tracking down a bug I ran into with a robust pthreads mutex
> shared between a parent and a child that it forked. This finally came
> down to a bad interaction between glibc and the kernel (and looks to
> be present in the current Linux trees as well as glibc 2.13): 
> copy_process() explicitly clears the robust_list pointer in the cloned
> child. However, there is currently no logic in nptl to re-establish
> the robust list pointer after a fork.
> 
> There was some conversation about this in the Fedora bugtracker:
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=628608
> 
> Those folks appear to have reached the same conclusion as I: this
> could either be solved with some potentially complex glibc code, or by
> simply not having the kernel NULL out robust_list in the child.

That conclusion is just wrong.

> Can anyone say what problem was being fixed by initializing the robust
> list(s) to NULL? I've stared at the implementation, and I cannot see any
> harm (potentially a slight bit more work in exec, but no harm) in not
> clearing them.

copy_process() is used for a lot more than fork(). The most obvious
example is pthread_create() where we _MUST_ clear it in copy_process()
otherwise an exiting thread would run through the parents
robust_list.

You cannot preserve the pointer for fork either. The simple reason is
that it points to the parents list, which is copied during fork. So:

   mutex_1 = mmap(region1);
   mutex_2 = mmap(region2);  

   pthread_mutex_lock(mutex_1);

   pid = fork();

   if (!pid) {

        pthread_mutex_lock(mutex_2);

	-> mutex_2 is on robust_list after mutex_1

      	do_something();

	exit();

---> Kernel looks at the copied robust list, which says that it holds
     mutex_1 and mutex_2.

     The sanity checks in exit_robust_list will catch that mutex_1
     owner is not matching, but that makes it not more correct.

     Worse, the child might have unmapped region1 because it does not
     use it at all. That will fault in handling the robust list
     mutex_1 entry and then abort the list walk. So mutex_2 remains
     locked. Not brilliant either.

So the robust_list = NULL stays no matter what.

And I do not buy the argument about "complex glibc code" at all. glibc
already handles it for pthread_create() so why the hell can't it
handle it for fork() ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about clearing of tsk->robust_list in clone
  2011-02-15 13:16 ` Thomas Gleixner
@ 2011-02-15 14:00   ` Peter Zijlstra
  2011-02-15 14:19     ` Thomas Gleixner
  2011-02-15 22:51   ` Kenneth Albanowski (Palm GBU)
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2011-02-15 14:00 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kenneth Albanowski (Palm GBU),
	linux-kernel, Ingo Molnar, Jakub Jelinek, Andreas Schwab

On Tue, 2011-02-15 at 14:16 +0100, Thomas Gleixner wrote:
> 
> And I do not buy the argument about "complex glibc code" at all. glibc
> already handles it for pthread_create() so why the hell can't it
> handle it for fork() ? 

Going by comment #9 they think calling sys_set_robust_list() on every
fork() is too expensive for them, but realistically we cannot do
anything about it, ->robust_list is strictly task state.

I also think their suggestion in comment #11 (lazy state) is flawed,
what if the parent never users robust futexes, in that case the state
will indicate not to initialize the robust state for its children, again
leading to the observed wreckage.

Realistically libpthread should register an on_fork() callback to ensure
the state is properly propagated.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about clearing of tsk->robust_list in clone
  2011-02-15 14:00   ` Peter Zijlstra
@ 2011-02-15 14:19     ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2011-02-15 14:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Kenneth Albanowski (Palm GBU),
	linux-kernel, Ingo Molnar, Jakub Jelinek, Andreas Schwab

On Tue, 15 Feb 2011, Peter Zijlstra wrote:

> On Tue, 2011-02-15 at 14:16 +0100, Thomas Gleixner wrote:
> > 
> > And I do not buy the argument about "complex glibc code" at all. glibc
> > already handles it for pthread_create() so why the hell can't it
> > handle it for fork() ? 
> 
> Going by comment #9 they think calling sys_set_robust_list() on every
> fork() is too expensive for them, but realistically we cannot do
> anything about it, ->robust_list is strictly task state.

Right. 

Vs. too expensive: We already call set_robust_list() on every
pthread_create and on every process start when libpthread is
linked. So what makes fork() so special? Aside of that
set_robust_list() is hardly an expensive syscall.

> I also think their suggestion in comment #11 (lazy state) is flawed,
> what if the parent never users robust futexes, in that case the state
> will indicate not to initialize the robust state for its children, again
> leading to the observed wreckage.

Yup.
 
> Realistically libpthread should register an on_fork() callback to ensure
> the state is properly propagated.

Makes sense.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Question about clearing of tsk->robust_list in clone
  2011-02-15 13:16 ` Thomas Gleixner
  2011-02-15 14:00   ` Peter Zijlstra
@ 2011-02-15 22:51   ` Kenneth Albanowski (Palm GBU)
  2011-02-15 23:01     ` Thomas Gleixner
  1 sibling, 1 reply; 6+ messages in thread
From: Kenneth Albanowski (Palm GBU) @ 2011-02-15 22:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Jakub Jelinek, Andreas Schwab

On Tue, 15 Feb 2011, Thomas Gleixner wrote:

> The sanity checks in exit_robust_list will catch that mutex_1
> owner is not matching, but that makes it not more correct.

Yes, that's my remaining question: whether the intention was that the
owner filter would often be preventing erroneous unlocks, or whether
there should never be an inappropriate list in normal usage. The owner
check does seem sufficient to prevent mayhem if the list pointer is
copied to the child.

On Tue, 15 Feb 2011, Peter Zijlstra wrote:

> Realistically libpthread should register an on_fork() callback to 
> ensure the state is properly propagated.

Agreed, that seems reasonable, with only the minor impact of an
additional set_robust_list call. That resolves this as a libc issue,
not a kernel issue.

- Kenneth


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Question about clearing of tsk->robust_list in clone
  2011-02-15 22:51   ` Kenneth Albanowski (Palm GBU)
@ 2011-02-15 23:01     ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2011-02-15 23:01 UTC (permalink / raw)
  To: Kenneth Albanowski (Palm GBU)
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Jakub Jelinek, Andreas Schwab

On Tue, 15 Feb 2011, Kenneth Albanowski (Palm GBU) wrote:

> On Tue, 15 Feb 2011, Thomas Gleixner wrote:
> 
> > The sanity checks in exit_robust_list will catch that mutex_1
> > owner is not matching, but that makes it not more correct.
> 
> Yes, that's my remaining question: whether the intention was that the
> owner filter would often be preventing erroneous unlocks, or whether
> there should never be an inappropriate list in normal usage. The owner
> check does seem sufficient to prevent mayhem if the list pointer is
> copied to the child.

It's task state, so we cannot keep state around which belongs to some
other task. 

Of course we have nevertheless sanity checks in place as we have no
idea what kind of crap user space hands us as "robust_list" pointer.

Ideally we never need to walk that list at all in the normal exit
case when user space behaves nicely.
 
> On Tue, 15 Feb 2011, Peter Zijlstra wrote:
> 
> > Realistically libpthread should register an on_fork() callback to 
> > ensure the state is properly propagated.
> 
> Agreed, that seems reasonable, with only the minor impact of an
> additional set_robust_list call. That resolves this as a libc issue,
> not a kernel issue.

As I said, I don't understand that argument of the additional call at
all. We do it on every process start and on every pthread_create
already, just fork() was forgotten somehow.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-02-15 23:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-15  6:43 Question about clearing of tsk->robust_list in clone Kenneth Albanowski (Palm GBU)
2011-02-15 13:16 ` Thomas Gleixner
2011-02-15 14:00   ` Peter Zijlstra
2011-02-15 14:19     ` Thomas Gleixner
2011-02-15 22:51   ` Kenneth Albanowski (Palm GBU)
2011-02-15 23:01     ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).