LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* futex(2) man page update help request
@ 2014-05-14 10:35 Michael Kerrisk (man-pages)
  2014-05-14 16:18 ` Darren Hart
  0 siblings, 1 reply; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-05-14 10:35 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Jakub Jelinek, Darren Hart
  Cc: mtk.manpages, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API

[So, some futex recent discussions remind me I should make this request]

Hello all (especially those in the To:, namely Thomas, Darren, Ingo, Jakub),

The futex man pages:
http://man7.org/linux/man-pages/man2/futex.2.html
http://man7.org/linux/man-pages/man7/futex.7.html
are currently in a sorry state. I'm by no means convinced that all of the
futex operations described there are explained fully and correctly. And 
probably not all error cases for each operation are properly documented.
I'd be very happy if some folk could review those pages and send me 
corrections (Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages). 

But worse, a number of futex operations remain undocumented in futex(2)
(see the list below).

I am aware of Documentation/pi-futex.txt and 
Documentation/futex-requeue-pi.txt. However, both of those documents
are rather thin on details / explain what certain FUTEX_* operations are
used for rather than what they do / focus on the implementation, rather 
than the semantics.

What I would like is that the futex(2) page documenta each one of 
these operations with a focus on the semantics in a way that might be
useful to writers of library functions or those who simply wish to
better understand (from a user-space perspective) what futexes are 
and how they are used. However, I don't have the knowledge to do 
this well in any reasonable time. 

Would the folk in the To: list (or anyone else who is knowledgeable) 
be willing to write patches
(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
or just supply me with some raw text that documents these currently
undocumented futex operations, in the manner suggested?

The operations for which documentation is currently missing are:

2.6.14 adds FUTEX_WAKE_OP
    commit 4732efbeb997189d9f9b04708dc26bf8613ed721
    Author: Jakub Jelinek <jakub@redhat.com>
    Date:   Tue Sep 6 15:16:25 2005 -0700
    See also https://bugzilla.kernel.org/show_bug.cgi?id=14303

2.6.18 adds priority inheritance support:
FUTEX_LOCK_PI, FUTEX_UNLOCK_PI, and FUTEX_TRYLOCK_PI.  
    commit c87e2837be82df479a6bae9f155c43516d2feebc
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jun 27 02:54:58 2006 -0700

    commit e2970f2fb6950183a34e8545faa093eb49d186e1
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jun 27 02:54:47 2006 -0700

    See Documentation/pi-futex.txt

2.6.25 adds FUTEX_WAKE_BITSET, FUTEX_WAIT_BITSET
    commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
    Author: Thomas Gleixner <tglx@linutronix.de>
    Date:   Fri Feb 1 17:45:14 2008 +0100

2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI
    commit 52400ba946759af28442dee6265c5c0180ac7122
    Author: Darren Hart <dvhltc@us.ibm.com>
    Date:   Fri Apr 3 13:40:49 2009 -0700

    commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358
    Author: Darren Hart <dvhltc@us.ibm.com>
    Date:   Mon Apr 20 22:22:22 2009 -0700

    See Documentation/futex-requeue-pi.txt

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 10:35 futex(2) man page update help request Michael Kerrisk (man-pages)
@ 2014-05-14 16:18 ` Darren Hart
  2014-05-14 19:03   ` Michael Kerrisk (man-pages)
                     ` (3 more replies)
  0 siblings, 4 replies; 80+ messages in thread
From: Darren Hart @ 2014-05-14 16:18 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), Thomas Gleixner, Ingo Molnar, Jakub Jelinek
  Cc: linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 5/14/14, 3:35, "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
wrote:

>[So, some futex recent discussions remind me I should make this request]
>
>Hello all (especially those in the To:, namely Thomas, Darren, Ingo,
>Jakub),
>
>The futex man pages:
>http://man7.org/linux/man-pages/man2/futex.2.html
>http://man7.org/linux/man-pages/man7/futex.7.html
>are currently in a sorry state. I'm by no means convinced that all of the
>futex operations described there are explained fully and correctly. And
>probably not all error cases for each operation are properly documented.
>I'd be very happy if some folk could review those pages and send me
>corrections (Git: 
>http://git.kernel.org/pub/scm/docs/man-pages/man-pages).
>
>But worse, a number of futex operations remain undocumented in futex(2)
>(see the list below).
>
>I am aware of Documentation/pi-futex.txt and
>Documentation/futex-requeue-pi.txt. However, both of those documents
>are rather thin on details / explain what certain FUTEX_* operations are
>used for rather than what they do / focus on the implementation, rather
>than the semantics.
>
>What I would like is that the futex(2) page documenta each one of
>these operations with a focus on the semantics in a way that might be
>useful to writers of library functions or those who simply wish to
>better understand (from a user-space perspective) what futexes are
>and how they are used. However, I don't have the knowledge to do
>this well in any reasonable time.
>
>Would the folk in the To: list (or anyone else who is knowledgeable)
>be willing to write patches
>(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
>or just supply me with some raw text that documents these currently
>undocumented futex operations, in the manner suggested?

Yes, I'll be glad to help.

However, unless I'm sorely mistaken, the larger problem is that glibc
removed the futex() call entirely, so these man pages don't describe
something users even have access to anymore. I had to revert to calling
the syscalls directly in the futextest test suite because of this:

http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu
de/futextest.h#n67


Which basically defines:

#define futex(uaddr, op, val, timeout, uaddr2, val3, opflags) \
        syscall(SYS_futex, uaddr, op | opflags, val, timeout, uaddr2, val3)


Adding Carlos for his perspective.

If I'm wrong, or we can restore the futex() call, great. If not... Should
we keep the man-pages and document it as syscall(SYS_futex, ..., op, ...) ?

Thanks,

Darren

>
>The operations for which documentation is currently missing are:
>
>2.6.14 adds FUTEX_WAKE_OP
>    commit 4732efbeb997189d9f9b04708dc26bf8613ed721
>    Author: Jakub Jelinek <jakub@redhat.com>
>    Date:   Tue Sep 6 15:16:25 2005 -0700
>    See also https://bugzilla.kernel.org/show_bug.cgi?id=14303
>
>2.6.18 adds priority inheritance support:
>FUTEX_LOCK_PI, FUTEX_UNLOCK_PI, and FUTEX_TRYLOCK_PI.
>    commit c87e2837be82df479a6bae9f155c43516d2feebc
>    Author: Ingo Molnar <mingo@elte.hu>
>    Date:   Tue Jun 27 02:54:58 2006 -0700
>
>    commit e2970f2fb6950183a34e8545faa093eb49d186e1
>    Author: Ingo Molnar <mingo@elte.hu>
>    Date:   Tue Jun 27 02:54:47 2006 -0700
>
>    See Documentation/pi-futex.txt
>
>2.6.25 adds FUTEX_WAKE_BITSET, FUTEX_WAIT_BITSET
>    commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
>    Author: Thomas Gleixner <tglx@linutronix.de>
>    Date:   Fri Feb 1 17:45:14 2008 +0100
>
>2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI
>    commit 52400ba946759af28442dee6265c5c0180ac7122
>    Author: Darren Hart <dvhltc@us.ibm.com>
>    Date:   Fri Apr 3 13:40:49 2009 -0700
>
>    commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358
>    Author: Darren Hart <dvhltc@us.ibm.com>
>    Date:   Mon Apr 20 22:22:22 2009 -0700
>
>    See Documentation/futex-requeue-pi.txt
>
>Thanks,
>
>Michael
>
>
>-- 
>Michael Kerrisk
>Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>Linux/UNIX System Programming Training: http://man7.org/training/
>


-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 16:18 ` Darren Hart
@ 2014-05-14 19:03   ` Michael Kerrisk (man-pages)
  2014-05-14 19:59     ` Darren Hart
                       ` (2 more replies)
  2014-05-14 21:05   ` Davidlohr Bueso
                     ` (2 subsequent siblings)
  3 siblings, 3 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-05-14 19:03 UTC (permalink / raw)
  To: Darren Hart
  Cc: Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi Darren,

On Wed, May 14, 2014 at 6:18 PM, Darren Hart <dvhart@linux.intel.com> wrote:
> On 5/14/14, 3:35, "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
> wrote:
>
>>[So, some futex recent discussions remind me I should make this request]
>>
>>Hello all (especially those in the To:, namely Thomas, Darren, Ingo,
>>Jakub),
>>
>>The futex man pages:
>>http://man7.org/linux/man-pages/man2/futex.2.html
>>http://man7.org/linux/man-pages/man7/futex.7.html
>>are currently in a sorry state. I'm by no means convinced that all of the
>>futex operations described there are explained fully and correctly. And
>>probably not all error cases for each operation are properly documented.
>>I'd be very happy if some folk could review those pages and send me
>>corrections (Git:
>>http://git.kernel.org/pub/scm/docs/man-pages/man-pages).
>>
>>But worse, a number of futex operations remain undocumented in futex(2)
>>(see the list below).
>>
>>I am aware of Documentation/pi-futex.txt and
>>Documentation/futex-requeue-pi.txt. However, both of those documents
>>are rather thin on details / explain what certain FUTEX_* operations are
>>used for rather than what they do / focus on the implementation, rather
>>than the semantics.
>>
>>What I would like is that the futex(2) page documenta each one of
>>these operations with a focus on the semantics in a way that might be
>>useful to writers of library functions or those who simply wish to
>>better understand (from a user-space perspective) what futexes are
>>and how they are used. However, I don't have the knowledge to do
>>this well in any reasonable time.
>>
>>Would the folk in the To: list (or anyone else who is knowledgeable)
>>be willing to write patches
>>(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
>>or just supply me with some raw text that documents these currently
>>undocumented futex operations, in the manner suggested?
>
> Yes, I'll be glad to help.

Thanks!

> However, unless I'm sorely mistaken, the larger problem is that glibc
> removed the futex() call entirely, so these man pages don't describe

I don't think futex() ever was in glibc--that's by design, and
completely understandable: no user-space application would want to
directly use futex(). (BTW, I mispoke in my earlier mail when I said I
wanted documentation suitable for "writers of library functions" -- I
meant suitable for "writers of *C library*".)

> something users even have access to anymore. I had to revert to calling
> the syscalls directly in the futextest test suite because of this:
>
> http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu
> de/futextest.h#n67
>
>
> Which basically defines:
>
> #define futex(uaddr, op, val, timeout, uaddr2, val3, opflags) \
>         syscall(SYS_futex, uaddr, op | opflags, val, timeout, uaddr2, val3)
>
>
> Adding Carlos for his perspective.
>
> If I'm wrong, or we can restore the futex() call, great. If not... Should
> we keep the man-pages and document it as syscall(SYS_futex, ..., op, ...) ?

We should keep the man page ;-). I just need to add some words to
point out the use of syscall(). Mostly though, I'm interested in
getting documentation of the operations listed below :-).

Cheers,

Michael



>>    commit 4732efbeb997189d9f9b04708dc26bf8613ed721
>>    Author: Jakub Jelinek <jakub@redhat.com>
>>    Date:   Tue Sep 6 15:16:25 2005 -0700
>>    See also https://bugzilla.kernel.org/show_bug.cgi?id=14303
>>
>>2.6.18 adds priority inheritance support:
>>FUTEX_LOCK_PI, FUTEX_UNLOCK_PI, and FUTEX_TRYLOCK_PI.
>>    commit c87e2837be82df479a6bae9f155c43516d2feebc
>>    Author: Ingo Molnar <mingo@elte.hu>
>>    Date:   Tue Jun 27 02:54:58 2006 -0700
>>
>>    commit e2970f2fb6950183a34e8545faa093eb49d186e1
>>    Author: Ingo Molnar <mingo@elte.hu>
>>    Date:   Tue Jun 27 02:54:47 2006 -0700
>>
>>    See Documentation/pi-futex.txt
>>
>>2.6.25 adds FUTEX_WAKE_BITSET, FUTEX_WAIT_BITSET
>>    commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
>>    Author: Thomas Gleixner <tglx@linutronix.de>
>>    Date:   Fri Feb 1 17:45:14 2008 +0100
>>
>>2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI
>>    commit 52400ba946759af28442dee6265c5c0180ac7122
>>    Author: Darren Hart <dvhltc@us.ibm.com>
>>    Date:   Fri Apr 3 13:40:49 2009 -0700
>>
>>    commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358
>>    Author: Darren Hart <dvhltc@us.ibm.com>
>>    Date:   Mon Apr 20 22:22:22 2009 -0700
>>
>>    See Documentation/futex-requeue-pi.txt
>>
>>Thanks,
>>
>>Michael
>>
>>
>>--
>>Michael Kerrisk
>>Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>>Linux/UNIX System Programming Training: http://man7.org/training/
>>
>
>
> --
> Darren Hart                                     Open Source Technology Center
> darren.hart@intel.com                                       Intel Corporation
>
>
>



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 19:03   ` Michael Kerrisk (man-pages)
@ 2014-05-14 19:59     ` Darren Hart
  2014-05-14 20:23     ` Carlos O'Donell
  2014-05-14 20:56     ` Davidlohr Bueso
  2 siblings, 0 replies; 80+ messages in thread
From: Darren Hart @ 2014-05-14 19:59 UTC (permalink / raw)
  To: mtk.manpages
  Cc: Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On 5/14/14, 12:03, "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
wrote:

>Hi Darren,
>
>On Wed, May 14, 2014 at 6:18 PM, Darren Hart <dvhart@linux.intel.com>
>wrote:
>> On 5/14/14, 3:35, "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
>> wrote:
>>
>>>[So, some futex recent discussions remind me I should make this request]
>>>
>>>Hello all (especially those in the To:, namely Thomas, Darren, Ingo,
>>>Jakub),
>>>
>>>The futex man pages:
>>>http://man7.org/linux/man-pages/man2/futex.2.html
>>>http://man7.org/linux/man-pages/man7/futex.7.html
>>>are currently in a sorry state. I'm by no means convinced that all of
>>>the
>>>futex operations described there are explained fully and correctly. And
>>>probably not all error cases for each operation are properly documented.
>>>I'd be very happy if some folk could review those pages and send me
>>>corrections (Git:
>>>http://git.kernel.org/pub/scm/docs/man-pages/man-pages).
>>>
>>>But worse, a number of futex operations remain undocumented in futex(2)
>>>(see the list below).
>>>
>>>I am aware of Documentation/pi-futex.txt and
>>>Documentation/futex-requeue-pi.txt. However, both of those documents
>>>are rather thin on details / explain what certain FUTEX_* operations are
>>>used for rather than what they do / focus on the implementation, rather
>>>than the semantics.
>>>
>>>What I would like is that the futex(2) page documenta each one of
>>>these operations with a focus on the semantics in a way that might be
>>>useful to writers of library functions or those who simply wish to
>>>better understand (from a user-space perspective) what futexes are
>>>and how they are used. However, I don't have the knowledge to do
>>>this well in any reasonable time.
>>>
>>>Would the folk in the To: list (or anyone else who is knowledgeable)
>>>be willing to write patches
>>>(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
>>>or just supply me with some raw text that documents these currently
>>>undocumented futex operations, in the manner suggested?
>>
>> Yes, I'll be glad to help.
>
>Thanks!
>
>> However, unless I'm sorely mistaken, the larger problem is that glibc
>> removed the futex() call entirely, so these man pages don't describe
>
>I don't think futex() ever was in glibc--that's by design, and
>completely understandable: no user-space application would want to
>directly use futex(). (BTW, I mispoke in my earlier mail when I said I
>wanted documentation suitable for "writers of library functions" -- I
>meant suitable for "writers of *C library*".)
>
>> something users even have access to anymore. I had to revert to calling
>> the syscalls directly in the futextest test suite because of this:
>>
>> 
>>http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inc
>>lu
>> de/futextest.h#n67
>>
>>
>> Which basically defines:
>>
>> #define futex(uaddr, op, val, timeout, uaddr2, val3, opflags) \
>>         syscall(SYS_futex, uaddr, op | opflags, val, timeout, uaddr2,
>>val3)
>>
>>
>> Adding Carlos for his perspective.
>>
>> If I'm wrong, or we can restore the futex() call, great. If not...
>>Should
>> we keep the man-pages and document it as syscall(SYS_futex, ..., op,
>>...) ?
>
>We should keep the man page ;-). I just need to add some words to
>point out the use of syscall(). Mostly though, I'm interested in
>getting documentation of the operations listed below :-).

OK, yes - because the following doesn't work:

#include <linux/futex.h>
       #include <sys/time.h>

       int futex(int *uaddr, int op, int val, const struct timespec
*timeout,
          int *uaddr2, int val3);


It's going to be difficult for me to make the time to do this - but I do
want it done, so please don't hesitate to nag me.

Thanks,

>>>    commit 4732efbeb997189d9f9b04708dc26bf8613ed721
>>>    Author: Jakub Jelinek <jakub@redhat.com>
>>>    Date:   Tue Sep 6 15:16:25 2005 -0700
>>>    See also https://bugzilla.kernel.org/show_bug.cgi?id=14303
>>>
>>>2.6.18 adds priority inheritance support:
>>>FUTEX_LOCK_PI, FUTEX_UNLOCK_PI, and FUTEX_TRYLOCK_PI.
>>>    commit c87e2837be82df479a6bae9f155c43516d2feebc
>>>    Author: Ingo Molnar <mingo@elte.hu>
>>>    Date:   Tue Jun 27 02:54:58 2006 -0700
>>>
>>>    commit e2970f2fb6950183a34e8545faa093eb49d186e1
>>>    Author: Ingo Molnar <mingo@elte.hu>
>>>    Date:   Tue Jun 27 02:54:47 2006 -0700
>>>
>>>    See Documentation/pi-futex.txt
>>>
>>>2.6.25 adds FUTEX_WAKE_BITSET, FUTEX_WAIT_BITSET
>>>    commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
>>>    Author: Thomas Gleixner <tglx@linutronix.de>
>>>    Date:   Fri Feb 1 17:45:14 2008 +0100
>>>
>>>2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI
>>>    commit 52400ba946759af28442dee6265c5c0180ac7122
>>>    Author: Darren Hart <dvhltc@us.ibm.com>
>>>    Date:   Fri Apr 3 13:40:49 2009 -0700
>>>
>>>    commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358
>>>    Author: Darren Hart <dvhltc@us.ibm.com>
>>>    Date:   Mon Apr 20 22:22:22 2009 -0700
>>>
>>>    See Documentation/futex-requeue-pi.txt
>>>
>>>Thanks,
>>>
>>>Michael
>>>
>>>
>>>--
>>>Michael Kerrisk
>>>Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>>>Linux/UNIX System Programming Training: http://man7.org/training/
>>>
>>
>>
>> --
>> Darren Hart                                     Open Source Technology
>>Center
>> darren.hart@intel.com                                       Intel
>>Corporation
>>
>>
>>
>
>
>
>-- 
>Michael Kerrisk
>Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>Linux/UNIX System Programming Training: http://man7.org/training/
>


-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 19:03   ` Michael Kerrisk (man-pages)
  2014-05-14 19:59     ` Darren Hart
@ 2014-05-14 20:23     ` Carlos O'Donell
  2014-05-14 20:44       ` Andy Lutomirski
                         ` (3 more replies)
  2014-05-14 20:56     ` Davidlohr Bueso
  2 siblings, 4 replies; 80+ messages in thread
From: Carlos O'Donell @ 2014-05-14 20:23 UTC (permalink / raw)
  To: mtk.manpages, Darren Hart
  Cc: Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API

On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
>> However, unless I'm sorely mistaken, the larger problem is that glibc
>> removed the futex() call entirely, so these man pages don't describe
> 
> I don't think futex() ever was in glibc--that's by design, and
> completely understandable: no user-space application would want to
> directly use futex(). (BTW, I mispoke in my earlier mail when I said I
> wanted documentation suitable for "writers of library functions" -- I
> meant suitable for "writers of *C library*".)

I fully agree with Michael here.

The futex() syscall was never exposed to userspace specifically because
it was an interface we did not want to support forever with a stable ABI.
The futex() syscall is an implementation detail that is shared between
the kernel and the writers of core runtimes for Linux.

The fact that the futex() syscall is out of date is my fault, is the fault
of Linux kernel developers, etc. etc., we should all have reached out to
Michael with patches to keep this developer-centric documentation updated.
 
>> something users even have access to anymore. I had to revert to calling
>> the syscalls directly in the futextest test suite because of this:
>>
>> http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu
>> de/futextest.h#n67
>>
>>
>> Which basically defines:
>>
>> #define futex(uaddr, op, val, timeout, uaddr2, val3, opflags) \
>>         syscall(SYS_futex, uaddr, op | opflags, val, timeout, uaddr2, val3)
>>
>>
>> Adding Carlos for his perspective.
>>
>> If I'm wrong, or we can restore the futex() call, great. If not... Should
>> we keep the man-pages and document it as syscall(SYS_futex, ..., op, ...) ?
> 
> We should keep the man page ;-). I just need to add some words to
> point out the use of syscall(). Mostly though, I'm interested in
> getting documentation of the operations listed below :-).

Agreed. The man page should get expanded and should be as detailed as possible
about the interface the kernel is exposing.

There are other syscalls like gettid() that have a:
NOTE: There is no glibc wrapper for this system call; see NOTES.

That's what should happen here with futex() (though NOTES does mention this,
but it's not called out at the top of the man page).

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 20:23     ` Carlos O'Donell
@ 2014-05-14 20:44       ` Andy Lutomirski
  2014-05-14 23:34       ` Thomas Gleixner
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 80+ messages in thread
From: Andy Lutomirski @ 2014-05-14 20:44 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Michael Kerrisk-manpages, Darren Hart, Thomas Gleixner,
	Ingo Molnar, Jakub Jelinek, linux-man, lkml, Davidlohr Bueso,
	Arnd Bergmann, Steven Rostedt, Peter Zijlstra, Linux API

On Wed, May 14, 2014 at 1:23 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>> removed the futex() call entirely, so these man pages don't describe
>>
>> I don't think futex() ever was in glibc--that's by design, and
>> completely understandable: no user-space application would want to
>> directly use futex(). (BTW, I mispoke in my earlier mail when I said I
>> wanted documentation suitable for "writers of library functions" -- I
>> meant suitable for "writers of *C library*".)
>
> I fully agree with Michael here.
>
> The futex() syscall was never exposed to userspace specifically because
> it was an interface we did not want to support forever with a stable ABI.
> The futex() syscall is an implementation detail that is shared between
> the kernel and the writers of core runtimes for Linux.
>
> The fact that the futex() syscall is out of date is my fault, is the fault
> of Linux kernel developers, etc. etc., we should all have reached out to
> Michael with patches to keep this developer-centric documentation updated.

I realize that this is out of scope for linux-abi, but I *stongly*
disagree with this notion.  futex() needs to be just as stable as
anything else: old glibc versions must continue to work.  I just
jumped through a bunch of hoops to keep a single glibc patch release
in OpenSUSE 9 working in a maintainable way; breaking futex will break
far more than that.

Additionally, at least the FUTEX_WAIT and FUTEX_WAKE operations are
extremely useful, and they can do things that are tedious at best
using mutexes and condvars.  It's a simple API to use.  I use it, and
I've seen plenty of other open-source apps using the futex API
directly.

I think the best way forward might be to try to convince the glibc
maintainers to add the wrapper.

--Andy

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 19:03   ` Michael Kerrisk (man-pages)
  2014-05-14 19:59     ` Darren Hart
  2014-05-14 20:23     ` Carlos O'Donell
@ 2014-05-14 20:56     ` Davidlohr Bueso
  2014-05-14 21:03       ` Darren Hart
  2014-05-15  0:28       ` H. Peter Anvin
  2 siblings, 2 replies; 80+ messages in thread
From: Davidlohr Bueso @ 2014-05-14 20:56 UTC (permalink / raw)
  To: mtk.manpages
  Cc: Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On Wed, 2014-05-14 at 21:03 +0200, Michael Kerrisk (man-pages) wrote:
> Hi Darren,
> 
> On Wed, May 14, 2014 at 6:18 PM, Darren Hart <dvhart@linux.intel.com> wrote:
> > On 5/14/14, 3:35, "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
> > wrote:
> >
> >>[So, some futex recent discussions remind me I should make this request]
> >>
> >>Hello all (especially those in the To:, namely Thomas, Darren, Ingo,
> >>Jakub),
> >>
> >>The futex man pages:
> >>http://man7.org/linux/man-pages/man2/futex.2.html
> >>http://man7.org/linux/man-pages/man7/futex.7.html
> >>are currently in a sorry state. I'm by no means convinced that all of the
> >>futex operations described there are explained fully and correctly. And
> >>probably not all error cases for each operation are properly documented.
> >>I'd be very happy if some folk could review those pages and send me
> >>corrections (Git:
> >>http://git.kernel.org/pub/scm/docs/man-pages/man-pages).
> >>
> >>But worse, a number of futex operations remain undocumented in futex(2)
> >>(see the list below).
> >>
> >>I am aware of Documentation/pi-futex.txt and
> >>Documentation/futex-requeue-pi.txt. However, both of those documents
> >>are rather thin on details / explain what certain FUTEX_* operations are
> >>used for rather than what they do / focus on the implementation, rather
> >>than the semantics.
> >>
> >>What I would like is that the futex(2) page documenta each one of
> >>these operations with a focus on the semantics in a way that might be
> >>useful to writers of library functions or those who simply wish to
> >>better understand (from a user-space perspective) what futexes are
> >>and how they are used. However, I don't have the knowledge to do
> >>this well in any reasonable time.
> >>
> >>Would the folk in the To: list (or anyone else who is knowledgeable)
> >>be willing to write patches
> >>(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
> >>or just supply me with some raw text that documents these currently
> >>undocumented futex operations, in the manner suggested?
> >
> > Yes, I'll be glad to help.
> 
> Thanks!
> 
> > However, unless I'm sorely mistaken, the larger problem is that glibc
> > removed the futex() call entirely, so these man pages don't describe
> 
> I don't think futex() ever was in glibc--that's by design, and
> completely understandable: no user-space application would want to
> directly use futex(). 

That's actually not quite true. There are plenty of software efforts out
there that use futex calls directly to implement userspace serialization
mechanisms as an alternative to the bulky sysv semaphores. I worked
closely with an in-memory DB project that makes heavy use of them. Not
everyone can simply rely on pthreads.

> (BTW, I mispoke in my earlier mail when I said I
> wanted documentation suitable for "writers of library functions" -- I
> meant suitable for "writers of *C library*".)

While kind of beyond the scope of documentation like manpages, I would
very much welcome this! The internet is full of
misleading/false/stupid/incomplete references to futexes and how to use
them. Having a best practises helps the kernel directly avoiding users
from misusing them.

Thanks,
Davidlohr



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 20:56     ` Davidlohr Bueso
@ 2014-05-14 21:03       ` Darren Hart
  2014-05-14 22:21         ` Paul E. McKenney
  2014-05-15  0:28       ` H. Peter Anvin
  1 sibling, 1 reply; 80+ messages in thread
From: Darren Hart @ 2014-05-14 21:03 UTC (permalink / raw)
  To: Davidlohr Bueso, mtk.manpages
  Cc: Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell, mathieu.desnoyers,
	Paul E. McKenney

On 5/14/14, 13:56, "Davidlohr Bueso" <davidlohr@hp.com> wrote:

>On Wed, 2014-05-14 at 21:03 +0200, Michael Kerrisk (man-pages) wrote:
>> Hi Darren,
>> 
>> On Wed, May 14, 2014 at 6:18 PM, Darren Hart <dvhart@linux.intel.com>
>>wrote:
>> > On 5/14/14, 3:35, "Michael Kerrisk (man-pages)"
>><mtk.manpages@gmail.com>
>> > wrote:
>> >
>> >>[So, some futex recent discussions remind me I should make this
>>request]
>> >>
>> >>Hello all (especially those in the To:, namely Thomas, Darren, Ingo,
>> >>Jakub),
>> >>
>> >>The futex man pages:
>> >>http://man7.org/linux/man-pages/man2/futex.2.html
>> >>http://man7.org/linux/man-pages/man7/futex.7.html
>> >>are currently in a sorry state. I'm by no means convinced that all of
>>the
>> >>futex operations described there are explained fully and correctly.
>>And
>> >>probably not all error cases for each operation are properly
>>documented.
>> >>I'd be very happy if some folk could review those pages and send me
>> >>corrections (Git:
>> >>http://git.kernel.org/pub/scm/docs/man-pages/man-pages).
>> >>
>> >>But worse, a number of futex operations remain undocumented in
>>futex(2)
>> >>(see the list below).
>> >>
>> >>I am aware of Documentation/pi-futex.txt and
>> >>Documentation/futex-requeue-pi.txt. However, both of those documents
>> >>are rather thin on details / explain what certain FUTEX_* operations
>>are
>> >>used for rather than what they do / focus on the implementation,
>>rather
>> >>than the semantics.
>> >>
>> >>What I would like is that the futex(2) page documenta each one of
>> >>these operations with a focus on the semantics in a way that might be
>> >>useful to writers of library functions or those who simply wish to
>> >>better understand (from a user-space perspective) what futexes are
>> >>and how they are used. However, I don't have the knowledge to do
>> >>this well in any reasonable time.
>> >>
>> >>Would the folk in the To: list (or anyone else who is knowledgeable)
>> >>be willing to write patches
>> >>(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
>> >>or just supply me with some raw text that documents these currently
>> >>undocumented futex operations, in the manner suggested?
>> >
>> > Yes, I'll be glad to help.
>> 
>> Thanks!
>> 
>> > However, unless I'm sorely mistaken, the larger problem is that glibc
>> > removed the futex() call entirely, so these man pages don't describe
>> 
>> I don't think futex() ever was in glibc--that's by design, and
>> completely understandable: no user-space application would want to
>> directly use futex().
>
>That's actually not quite true. There are plenty of software efforts out
>there that use futex calls directly to implement userspace serialization
>mechanisms as an alternative to the bulky sysv semaphores. I worked
>closely with an in-memory DB project that makes heavy use of them. Not
>everyone can simply rely on pthreads.

Userspace RCU from Mattieu Desnoyers / Efficios uses it as well - or did.

>
>> (BTW, I mispoke in my earlier mail when I said I
>> wanted documentation suitable for "writers of library functions" -- I
>> meant suitable for "writers of *C library*".)
>
>While kind of beyond the scope of documentation like manpages, I would
>very much welcome this! The internet is full of
>misleading/false/stupid/incomplete references to futexes and how to use
>them. Having a best practises helps the kernel directly avoiding users
>from misusing them.

Agreed.

-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 16:18 ` Darren Hart
  2014-05-14 19:03   ` Michael Kerrisk (man-pages)
@ 2014-05-14 21:05   ` Davidlohr Bueso
  2014-05-15 15:15     ` Joseph S. Myers
  2014-05-15  0:18   ` H. Peter Anvin
  2014-05-15 15:28   ` chrubis
  3 siblings, 1 reply; 80+ messages in thread
From: Davidlohr Bueso @ 2014-05-14 21:05 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On Wed, 2014-05-14 at 09:18 -0700, Darren Hart wrote:
> On 5/14/14, 3:35, "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
> wrote:
> 
> >[So, some futex recent discussions remind me I should make this request]
> >
> >Hello all (especially those in the To:, namely Thomas, Darren, Ingo,
> >Jakub),
> >
> >The futex man pages:
> >http://man7.org/linux/man-pages/man2/futex.2.html
> >http://man7.org/linux/man-pages/man7/futex.7.html
> >are currently in a sorry state. I'm by no means convinced that all of the
> >futex operations described there are explained fully and correctly. And
> >probably not all error cases for each operation are properly documented.
> >I'd be very happy if some folk could review those pages and send me
> >corrections (Git: 
> >http://git.kernel.org/pub/scm/docs/man-pages/man-pages).
> >
> >But worse, a number of futex operations remain undocumented in futex(2)
> >(see the list below).
> >
> >I am aware of Documentation/pi-futex.txt and
> >Documentation/futex-requeue-pi.txt. However, both of those documents
> >are rather thin on details / explain what certain FUTEX_* operations are
> >used for rather than what they do / focus on the implementation, rather
> >than the semantics.
> >
> >What I would like is that the futex(2) page documenta each one of
> >these operations with a focus on the semantics in a way that might be
> >useful to writers of library functions or those who simply wish to
> >better understand (from a user-space perspective) what futexes are
> >and how they are used. However, I don't have the knowledge to do
> >this well in any reasonable time.
> >
> >Would the folk in the To: list (or anyone else who is knowledgeable)
> >be willing to write patches
> >(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
> >or just supply me with some raw text that documents these currently
> >undocumented futex operations, in the manner suggested?
> 
> Yes, I'll be glad to help.
> 
> However, unless I'm sorely mistaken, the larger problem is that glibc
> removed the futex() call entirely, so these man pages don't describe
> something users even have access to anymore. I had to revert to calling
> the syscalls directly in the futextest test suite because of this:
> 
> http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu
> de/futextest.h#n67
> 
> 
> Which basically defines:
> 
> #define futex(uaddr, op, val, timeout, uaddr2, val3, opflags) \
>         syscall(SYS_futex, uaddr, op | opflags, val, timeout, uaddr2, val3)

Yeah, and I actually stole that from you for perf :)

> 
> Adding Carlos for his perspective.
> 
> If I'm wrong, or we can restore the futex() call, great. If not... Should
> we keep the man-pages and document it as syscall(SYS_futex, ..., op, ...) ?

+1, is there anything preventing adding a futex wrapper... glibc folks?


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 21:03       ` Darren Hart
@ 2014-05-14 22:21         ` Paul E. McKenney
  0 siblings, 0 replies; 80+ messages in thread
From: Paul E. McKenney @ 2014-05-14 22:21 UTC (permalink / raw)
  To: Darren Hart
  Cc: Davidlohr Bueso, mtk.manpages, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API, Carlos O'Donell,
	mathieu.desnoyers

On Wed, May 14, 2014 at 02:03:13PM -0700, Darren Hart wrote:
> On 5/14/14, 13:56, "Davidlohr Bueso" <davidlohr@hp.com> wrote:
> 
> >On Wed, 2014-05-14 at 21:03 +0200, Michael Kerrisk (man-pages) wrote:
> >> Hi Darren,
> >> 
> >> On Wed, May 14, 2014 at 6:18 PM, Darren Hart <dvhart@linux.intel.com>
> >>wrote:
> >> > On 5/14/14, 3:35, "Michael Kerrisk (man-pages)"
> >><mtk.manpages@gmail.com>
> >> > wrote:
> >> >
> >> >>[So, some futex recent discussions remind me I should make this
> >>request]
> >> >>
> >> >>Hello all (especially those in the To:, namely Thomas, Darren, Ingo,
> >> >>Jakub),
> >> >>
> >> >>The futex man pages:
> >> >>http://man7.org/linux/man-pages/man2/futex.2.html
> >> >>http://man7.org/linux/man-pages/man7/futex.7.html
> >> >>are currently in a sorry state. I'm by no means convinced that all of
> >>the
> >> >>futex operations described there are explained fully and correctly.
> >>And
> >> >>probably not all error cases for each operation are properly
> >>documented.
> >> >>I'd be very happy if some folk could review those pages and send me
> >> >>corrections (Git:
> >> >>http://git.kernel.org/pub/scm/docs/man-pages/man-pages).
> >> >>
> >> >>But worse, a number of futex operations remain undocumented in
> >>futex(2)
> >> >>(see the list below).
> >> >>
> >> >>I am aware of Documentation/pi-futex.txt and
> >> >>Documentation/futex-requeue-pi.txt. However, both of those documents
> >> >>are rather thin on details / explain what certain FUTEX_* operations
> >>are
> >> >>used for rather than what they do / focus on the implementation,
> >>rather
> >> >>than the semantics.
> >> >>
> >> >>What I would like is that the futex(2) page documenta each one of
> >> >>these operations with a focus on the semantics in a way that might be
> >> >>useful to writers of library functions or those who simply wish to
> >> >>better understand (from a user-space perspective) what futexes are
> >> >>and how they are used. However, I don't have the knowledge to do
> >> >>this well in any reasonable time.
> >> >>
> >> >>Would the folk in the To: list (or anyone else who is knowledgeable)
> >> >>be willing to write patches
> >> >>(Git: http://git.kernel.org/pub/scm/docs/man-pages/man-pages )
> >> >>or just supply me with some raw text that documents these currently
> >> >>undocumented futex operations, in the manner suggested?
> >> >
> >> > Yes, I'll be glad to help.
> >> 
> >> Thanks!
> >> 
> >> > However, unless I'm sorely mistaken, the larger problem is that glibc
> >> > removed the futex() call entirely, so these man pages don't describe
> >> 
> >> I don't think futex() ever was in glibc--that's by design, and
> >> completely understandable: no user-space application would want to
> >> directly use futex().
> >
> >That's actually not quite true. There are plenty of software efforts out
> >there that use futex calls directly to implement userspace serialization
> >mechanisms as an alternative to the bulky sysv semaphores. I worked
> >closely with an in-memory DB project that makes heavy use of them. Not
> >everyone can simply rely on pthreads.
> 
> Userspace RCU from Mattieu Desnoyers / Efficios uses it as well - or did.

It still does.  ;-)

							Thanx, Paul

> >> (BTW, I mispoke in my earlier mail when I said I
> >> wanted documentation suitable for "writers of library functions" -- I
> >> meant suitable for "writers of *C library*".)
> >
> >While kind of beyond the scope of documentation like manpages, I would
> >very much welcome this! The internet is full of
> >misleading/false/stupid/incomplete references to futexes and how to use
> >them. Having a best practises helps the kernel directly avoiding users
> >from misusing them.
> 
> Agreed.
> 
> -- 
> Darren Hart					Open Source Technology Center
> darren.hart@intel.com				            Intel Corporation
> 
> 
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 20:23     ` Carlos O'Donell
  2014-05-14 20:44       ` Andy Lutomirski
@ 2014-05-14 23:34       ` Thomas Gleixner
  2014-05-15  3:12         ` Carlos O'Donell
  2014-05-15  4:53         ` Michael Kerrisk (man-pages)
  2014-05-15  8:13       ` Peter Zijlstra
  2014-05-15  8:14       ` Peter Zijlstra
  3 siblings, 2 replies; 80+ messages in thread
From: Thomas Gleixner @ 2014-05-14 23:34 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: mtk.manpages, Darren Hart, Ingo Molnar, Jakub Jelinek, linux-man,
	lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API

On Wed, 14 May 2014, Carlos O'Donell wrote:

> On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
> >> However, unless I'm sorely mistaken, the larger problem is that glibc
> >> removed the futex() call entirely, so these man pages don't describe
> > 
> > I don't think futex() ever was in glibc--that's by design, and
> > completely understandable: no user-space application would want to
> > directly use futex(). (BTW, I mispoke in my earlier mail when I said I
> > wanted documentation suitable for "writers of library functions" -- I
> > meant suitable for "writers of *C library*".)
> 
> I fully agree with Michael here.
> 
> The futex() syscall was never exposed to userspace specifically because
> it was an interface we did not want to support forever with a stable ABI.
> The futex() syscall is an implementation detail that is shared between
> the kernel and the writers of core runtimes for Linux.

Nonsense. 

If we change that interface (aside of adding functionality or some new
error return) it would break the world and some more, simply because
out of the blue glibc-2.xx would stop to work on linux-3.yy.

Aside of that the futex syscall is used as a bare interface without
any glibc interaction:

 - It's handy to implement user space wait queues

 - It's (ab)used in very interesting ways by data base apps

 - It's (ab)used by some Java monstrosities.

Nothing you care about and you really don't want to see the gory
details, but you have to accept that there is an universe which is
happy to deal with the raw syscalls instead of going through some ill
defined posix interfaces.

Thanks,

	tglx
 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 16:18 ` Darren Hart
  2014-05-14 19:03   ` Michael Kerrisk (man-pages)
  2014-05-14 21:05   ` Davidlohr Bueso
@ 2014-05-15  0:18   ` H. Peter Anvin
  2014-05-15  5:21     ` Darren Hart
  2014-05-15 15:35     ` chrubis
  2014-05-15 15:28   ` chrubis
  3 siblings, 2 replies; 80+ messages in thread
From: H. Peter Anvin @ 2014-05-15  0:18 UTC (permalink / raw)
  To: Darren Hart, Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek
  Cc: linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 05/14/2014 09:18 AM, Darren Hart wrote:
> 
> However, unless I'm sorely mistaken, the larger problem is that glibc
> removed the futex() call entirely, so these man pages don't describe
> something users even have access to anymore. I had to revert to calling
> the syscalls directly in the futextest test suite because of this:
> 
> http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu
> de/futextest.h#n67
> 

This really comes down to the fact that we should have a libinux which
contains the basic system call wrapper machinery for Linux specific
things and nothing else.

syscall(3) is toxic and breaks randomly on some platforms.

	-hpa



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 20:56     ` Davidlohr Bueso
  2014-05-14 21:03       ` Darren Hart
@ 2014-05-15  0:28       ` H. Peter Anvin
  2014-05-15  0:35         ` Andy Lutomirski
  2014-05-15 19:10         ` Carlos O'Donell
  1 sibling, 2 replies; 80+ messages in thread
From: H. Peter Anvin @ 2014-05-15  0:28 UTC (permalink / raw)
  To: Davidlohr Bueso, mtk.manpages
  Cc: Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 05/14/2014 01:56 PM, Davidlohr Bueso wrote:
>>
>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>> removed the futex() call entirely, so these man pages don't describe
>>
>> I don't think futex() ever was in glibc--that's by design, and
>> completely understandable: no user-space application would want to
>> directly use futex(). 
> 
> That's actually not quite true. There are plenty of software efforts out
> there that use futex calls directly to implement userspace serialization
> mechanisms as an alternative to the bulky sysv semaphores. I worked
> closely with an in-memory DB project that makes heavy use of them. Not
> everyone can simply rely on pthreads.
> 

More fundamentally, futex(2), like clone(2), are things that can be
legitimately by user space without automatically breaking all of glibc.
 There are some other things where that is *not* true, because glibc
relies on being able to mediate all accesses to a kernel facility, but
not here.

	-hpa


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  0:28       ` H. Peter Anvin
@ 2014-05-15  0:35         ` Andy Lutomirski
  2014-05-15  0:41           ` H. Peter Anvin
  2014-05-15 19:10         ` Carlos O'Donell
  1 sibling, 1 reply; 80+ messages in thread
From: Andy Lutomirski @ 2014-05-15  0:35 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Davidlohr Bueso, Michael Kerrisk-manpages, Darren Hart,
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On Wed, May 14, 2014 at 5:28 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 05/14/2014 01:56 PM, Davidlohr Bueso wrote:
>>>
>>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>>> removed the futex() call entirely, so these man pages don't describe
>>>
>>> I don't think futex() ever was in glibc--that's by design, and
>>> completely understandable: no user-space application would want to
>>> directly use futex().
>>
>> That's actually not quite true. There are plenty of software efforts out
>> there that use futex calls directly to implement userspace serialization
>> mechanisms as an alternative to the bulky sysv semaphores. I worked
>> closely with an in-memory DB project that makes heavy use of them. Not
>> everyone can simply rely on pthreads.
>>
>
> More fundamentally, futex(2), like clone(2), are things that can be
> legitimately by user space without automatically breaking all of glibc.

I'm lost -- I think the missing verb is important :)

>  There are some other things where that is *not* true, because glibc
> relies on being able to mediate all accesses to a kernel facility, but
> not here.

--Andy

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  0:35         ` Andy Lutomirski
@ 2014-05-15  0:41           ` H. Peter Anvin
  0 siblings, 0 replies; 80+ messages in thread
From: H. Peter Anvin @ 2014-05-15  0:41 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Davidlohr Bueso, Michael Kerrisk-manpages, Darren Hart,
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On 05/14/2014 05:35 PM, Andy Lutomirski wrote:
>>
>> More fundamentally, futex(2), like clone(2), are things that can be
>> legitimately by user space without automatically breaking all of glibc.
> 
> I'm lost -- I think the missing verb is important :)
> 

... legitimately *used* by user space ...

As in you can use it to implement your own, non-POSIX, synchronization
primitives.

	-hpa


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 23:34       ` Thomas Gleixner
@ 2014-05-15  3:12         ` Carlos O'Donell
  2014-05-15  4:49           ` Michael Kerrisk (man-pages)
  2014-05-15  4:53         ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 80+ messages in thread
From: Carlos O'Donell @ 2014-05-15  3:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mtk.manpages, Darren Hart, Ingo Molnar, Jakub Jelinek, linux-man,
	lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API

On 05/14/2014 07:34 PM, Thomas Gleixner wrote:
> On Wed, 14 May 2014, Carlos O'Donell wrote:
> 
>> On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
>>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>>> removed the futex() call entirely, so these man pages don't describe
>>>
>>> I don't think futex() ever was in glibc--that's by design, and
>>> completely understandable: no user-space application would want to
>>> directly use futex(). (BTW, I mispoke in my earlier mail when I said I
>>> wanted documentation suitable for "writers of library functions" -- I
>>> meant suitable for "writers of *C library*".)
>>
>> I fully agree with Michael here.
>>
>> The futex() syscall was never exposed to userspace specifically because
>> it was an interface we did not want to support forever with a stable ABI.
>> The futex() syscall is an implementation detail that is shared between
>> the kernel and the writers of core runtimes for Linux.
> 
> Nonsense. 

What is nonsense?

I do not want to be responsible for the futex API by having glibc provide
wrappers. That can't be nonsense since it's a glibc community decision to
make.

Perhaps the point at which we disagree is that I said "writers of core runtimes"
and you would rather I have said "any application wishing to use raw syscalls."
That's fine, I concede that point, I have no right to restrict raw syscall
usage.
 
> If we change that interface (aside of adding functionality or some new
> error return) it would break the world and some more, simply because
> out of the blue glibc-2.xx would stop to work on linux-3.yy.

No disagreement from me.

> Aside of that the futex syscall is used as a bare interface without
> any glibc interaction:
> 
>  - It's handy to implement user space wait queues
> 
>  - It's (ab)used in very interesting ways by data base apps
> 
>  - It's (ab)used by some Java monstrosities.
> 
> Nothing you care about and you really don't want to see the gory
> details, but you have to accept that there is an universe which is
> happy to deal with the raw syscalls instead of going through some ill
> defined posix interfaces.

Sure :-)

Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  3:12         ` Carlos O'Donell
@ 2014-05-15  4:49           ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-05-15  4:49 UTC (permalink / raw)
  To: Carlos O'Donell, Thomas Gleixner
  Cc: mtk.manpages, Darren Hart, Ingo Molnar, Jakub Jelinek, linux-man,
	lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API

On 05/15/2014 05:12 AM, Carlos O'Donell wrote:
> On 05/14/2014 07:34 PM, Thomas Gleixner wrote:
>> On Wed, 14 May 2014, Carlos O'Donell wrote:
>>
>>> On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
>>>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>>>> removed the futex() call entirely, so these man pages don't describe
>>>>
>>>> I don't think futex() ever was in glibc--that's by design, and
>>>> completely understandable: no user-space application would want to
>>>> directly use futex(). (BTW, I mispoke in my earlier mail when I said I
>>>> wanted documentation suitable for "writers of library functions" -- I
>>>> meant suitable for "writers of *C library*".)
>>>
>>> I fully agree with Michael here.
>>>
>>> The futex() syscall was never exposed to userspace specifically because
>>> it was an interface we did not want to support forever with a stable ABI.
>>> The futex() syscall is an implementation detail that is shared between
>>> the kernel and the writers of core runtimes for Linux.
>>
>> Nonsense. 
> 
> What is nonsense?

I suspect there's a misunderstanding between worlds here. Thomas means
that the kernel ABI is stable. You mean, glibc does not want to have to
export an ABI that you have to support.

> I do not want to be responsible for the futex API by having glibc provide
> wrappers. That can't be nonsense since it's a glibc community decision to
> make.

See my above.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 23:34       ` Thomas Gleixner
  2014-05-15  3:12         ` Carlos O'Donell
@ 2014-05-15  4:53         ` Michael Kerrisk (man-pages)
  2014-05-15 14:14           ` Thomas Gleixner
  1 sibling, 1 reply; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-05-15  4:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Carlos O'Donell, Darren Hart, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API

Hi Thomas,

On Thu, May 15, 2014 at 1:34 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 14 May 2014, Carlos O'Donell wrote:
>
>> On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
>> >> However, unless I'm sorely mistaken, the larger problem is that glibc
>> >> removed the futex() call entirely, so these man pages don't describe
>> >
>> > I don't think futex() ever was in glibc--that's by design, and
>> > completely understandable: no user-space application would want to
>> > directly use futex(). (BTW, I mispoke in my earlier mail when I said I
>> > wanted documentation suitable for "writers of library functions" -- I
>> > meant suitable for "writers of *C library*".)
>>
>> I fully agree with Michael here.
>>
>> The futex() syscall was never exposed to userspace specifically because
>> it was an interface we did not want to support forever with a stable ABI.
>> The futex() syscall is an implementation detail that is shared between
>> the kernel and the writers of core runtimes for Linux.
>
> Nonsense.
>
> If we change that interface (aside of adding functionality or some new
> error return) it would break the world and some more, simply because
> out of the blue glibc-2.xx would stop to work on linux-3.yy.
>
> Aside of that the futex syscall is used as a bare interface without
> any glibc interaction:
>
>  - It's handy to implement user space wait queues
>
>  - It's (ab)used in very interesting ways by data base apps
>
>  - It's (ab)used by some Java monstrosities.

Thanks for the education about user-space uses of futexes. I was unaware.

> Nothing you care about and you really don't want to see the gory
> details, but you have to accept that there is an universe which is
> happy to deal with the raw syscalls instead of going through some ill
> defined posix interfaces.

And that universe would love to have your documentation of
FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  0:18   ` H. Peter Anvin
@ 2014-05-15  5:21     ` Darren Hart
  2014-05-15  8:23       ` Peter Zijlstra
  2014-05-15 13:46       ` Michael Kerrisk (man-pages)
  2014-05-15 15:35     ` chrubis
  1 sibling, 2 replies; 80+ messages in thread
From: Darren Hart @ 2014-05-15  5:21 UTC (permalink / raw)
  To: H. Peter Anvin, Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek
  Cc: linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 5/14/14, 17:18, "H. Peter Anvin" <hpa@zytor.com> wrote:

>On 05/14/2014 09:18 AM, Darren Hart wrote:
>> 
>> However, unless I'm sorely mistaken, the larger problem is that glibc
>> removed the futex() call entirely, so these man pages don't describe
>> something users even have access to anymore. I had to revert to calling
>> the syscalls directly in the futextest test suite because of this:
>> 
>> 
>>http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inc
>>lu
>> de/futextest.h#n67
>> 
>
>This really comes down to the fact that we should have a libinux which
>contains the basic system call wrapper machinery for Linux specific
>things and nothing else.
>
>syscall(3) is toxic and breaks randomly on some platforms.

Peter Z and I have had a good time discussing this in the past.... And
here it is again. :-)


-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 20:23     ` Carlos O'Donell
  2014-05-14 20:44       ` Andy Lutomirski
  2014-05-14 23:34       ` Thomas Gleixner
@ 2014-05-15  8:13       ` Peter Zijlstra
  2014-05-15 15:43         ` Darren Hart
  2014-05-15  8:14       ` Peter Zijlstra
  3 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2014-05-15  8:13 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: mtk.manpages, Darren Hart, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Linux API

[-- Attachment #1: Type: text/plain, Size: 1201 bytes --]

On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
> On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
> >> However, unless I'm sorely mistaken, the larger problem is that glibc
> >> removed the futex() call entirely, so these man pages don't describe
> > 
> > I don't think futex() ever was in glibc--that's by design, and
> > completely understandable: no user-space application would want to
> > directly use futex(). (BTW, I mispoke in my earlier mail when I said I
> > wanted documentation suitable for "writers of library functions" -- I
> > meant suitable for "writers of *C library*".)
> 
> I fully agree with Michael here.
> 
> The futex() syscall was never exposed to userspace specifically because
> it was an interface we did not want to support forever with a stable ABI.
> The futex() syscall is an implementation detail that is shared between
> the kernel and the writers of core runtimes for Linux.

That ship has sailed.. for one we must always support old glibc which
uses the futex() syscall, and secondly there are known other programs
that actually use the futex syscall.

So that's really a non-argument, we're hard tied to the ABI.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 20:23     ` Carlos O'Donell
                         ` (2 preceding siblings ...)
  2014-05-15  8:13       ` Peter Zijlstra
@ 2014-05-15  8:14       ` Peter Zijlstra
  2014-05-15 13:18         ` Carlos O'Donell
  3 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2014-05-15  8:14 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: mtk.manpages, Darren Hart, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Linux API

[-- Attachment #1: Type: text/plain, Size: 290 bytes --]

On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
> There are other syscalls like gettid() that have a:
> NOTE: There is no glibc wrapper for this system call; see NOTES.

Yes, can we finally fix that please? It gets tedious having to endlessly
copy/paste that thing around.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  5:21     ` Darren Hart
@ 2014-05-15  8:23       ` Peter Zijlstra
  2014-05-15 13:46       ` Michael Kerrisk (man-pages)
  1 sibling, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2014-05-15  8:23 UTC (permalink / raw)
  To: Darren Hart
  Cc: H. Peter Anvin, Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Linux API,
	Carlos O'Donell

[-- Attachment #1: Type: text/plain, Size: 1444 bytes --]

On Wed, May 14, 2014 at 10:21:52PM -0700, Darren Hart wrote:
> On 5/14/14, 17:18, "H. Peter Anvin" <hpa@zytor.com> wrote:
> 
> >On 05/14/2014 09:18 AM, Darren Hart wrote:
> >> 
> >> However, unless I'm sorely mistaken, the larger problem is that glibc
> >> removed the futex() call entirely, so these man pages don't describe
> >> something users even have access to anymore. I had to revert to calling
> >> the syscalls directly in the futextest test suite because of this:
> >> 
> >> 
> >>http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inc
> >>lu
> >> de/futextest.h#n67
> >> 
> >
> >This really comes down to the fact that we should have a libinux which
> >contains the basic system call wrapper machinery for Linux specific
> >things and nothing else.
> >
> >syscall(3) is toxic and breaks randomly on some platforms.
> 
> Peter Z and I have had a good time discussing this in the past.... And
> here it is again. :-)

Oh but we wanted _way_ more than bare syscalls in there ;-)

For a start we wanted to make the vDSO a proper DSO that gets included
in the (dynamic) link chain.

/sys/lib/libdso{32,64}.so like

That would also allow all those archs that expose raw dso function
pointers for things like cmpxchg or memory barriers to just provide
platform functions instead, far more usable.

And yes, we wanted to hijack libpthread in order to finally fix the
futex mess :-)


[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  8:14       ` Peter Zijlstra
@ 2014-05-15 13:18         ` Carlos O'Donell
  2014-05-15 13:22           ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Carlos O'Donell @ 2014-05-15 13:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mtk.manpages, Darren Hart, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Linux API

On 05/15/2014 04:14 AM, Peter Zijlstra wrote:
> On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
>> There are other syscalls like gettid() that have a:
>> NOTE: There is no glibc wrapper for this system call; see NOTES.
> 
> Yes, can we finally fix that please? It gets tedious having to endlessly
> copy/paste that thing around.

What exactly would you like fixed?

Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 13:18         ` Carlos O'Donell
@ 2014-05-15 13:22           ` Peter Zijlstra
  2014-05-15 13:49             ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2014-05-15 13:22 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: mtk.manpages, Darren Hart, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Linux API

[-- Attachment #1: Type: text/plain, Size: 508 bytes --]

On Thu, May 15, 2014 at 09:18:22AM -0400, Carlos O'Donell wrote:
> On 05/15/2014 04:14 AM, Peter Zijlstra wrote:
> > On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
> >> There are other syscalls like gettid() that have a:
> >> NOTE: There is no glibc wrapper for this system call; see NOTES.
> > 
> > Yes, can we finally fix that please? It gets tedious having to endlessly
> > copy/paste that thing around.
> 
> What exactly would you like fixed?

Not having gettid() in glibc.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  5:21     ` Darren Hart
  2014-05-15  8:23       ` Peter Zijlstra
@ 2014-05-15 13:46       ` Michael Kerrisk (man-pages)
  2014-05-15 14:59         ` H. Peter Anvin
                           ` (2 more replies)
  1 sibling, 3 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-05-15 13:46 UTC (permalink / raw)
  To: Darren Hart, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Jakub Jelinek
  Cc: mtk.manpages, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API, Carlos O'Donell

On 05/15/2014 07:21 AM, Darren Hart wrote:
> On 5/14/14, 17:18, "H. Peter Anvin" <hpa@zytor.com> wrote:
> 
>> On 05/14/2014 09:18 AM, Darren Hart wrote:
>>>
>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>> removed the futex() call entirely, so these man pages don't describe
>>> something users even have access to anymore. I had to revert to calling
>>> the syscalls directly in the futextest test suite because of this:
>>>
>>>
>>> http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inc
>>> lu
>>> de/futextest.h#n67
>>>
>>
>> This really comes down to the fact that we should have a libinux which
>> contains the basic system call wrapper machinery for Linux specific
>> things and nothing else.
>>
>> syscall(3) is toxic and breaks randomly on some platforms.
> 
> Peter Z and I have had a good time discussing this in the past.... And
> here it is again. :-)

People have a number of times noted that there are problems
with syscall(), but I'm not knowledgeable on the details.
I'd happily take a patch to the man page (which, for historical
reasons, is actually syscall(2)) that explains the the problems 
(and ideally notes those platforms where there are no problems).

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 13:22           ` Peter Zijlstra
@ 2014-05-15 13:49             ` Michael Kerrisk (man-pages)
  2014-05-15 13:55               ` Peter Zijlstra
  2014-05-15 14:39               ` Carlos O'Donell
  0 siblings, 2 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-05-15 13:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Carlos O'Donell, Darren Hart, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Linux API

On Thu, May 15, 2014 at 3:22 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, May 15, 2014 at 09:18:22AM -0400, Carlos O'Donell wrote:
>> On 05/15/2014 04:14 AM, Peter Zijlstra wrote:
>> > On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
>> >> There are other syscalls like gettid() that have a:
>> >> NOTE: There is no glibc wrapper for this system call; see NOTES.
>> >
>> > Yes, can we finally fix that please? It gets tedious having to endlessly
>> > copy/paste that thing around.
>>
>> What exactly would you like fixed?
>
> Not having gettid() in glibc.

Get in the line ;-).
http://sourceware.org/bugzilla/show_bug.cgi?id=6399

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 13:49             ` Michael Kerrisk (man-pages)
@ 2014-05-15 13:55               ` Peter Zijlstra
  2014-05-15 14:39               ` Carlos O'Donell
  1 sibling, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2014-05-15 13:55 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Carlos O'Donell, Darren Hart, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Linux API

[-- Attachment #1: Type: text/plain, Size: 865 bytes --]

On Thu, May 15, 2014 at 03:49:10PM +0200, Michael Kerrisk (man-pages) wrote:
> On Thu, May 15, 2014 at 3:22 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Thu, May 15, 2014 at 09:18:22AM -0400, Carlos O'Donell wrote:
> >> On 05/15/2014 04:14 AM, Peter Zijlstra wrote:
> >> > On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
> >> >> There are other syscalls like gettid() that have a:
> >> >> NOTE: There is no glibc wrapper for this system call; see NOTES.
> >> >
> >> > Yes, can we finally fix that please? It gets tedious having to endlessly
> >> > copy/paste that thing around.
> >>
> >> What exactly would you like fixed?
> >
> > Not having gettid() in glibc.
> 
> Get in the line ;-).
> http://sourceware.org/bugzilla/show_bug.cgi?id=6399

Oh hey, it moved.. :-) Hadn't seen the comments since the 2008
time-frame.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  4:53         ` Michael Kerrisk (man-pages)
@ 2014-05-15 14:14           ` Thomas Gleixner
  2014-05-15 20:19             ` Michael Kerrisk (man-pages)
                               ` (2 more replies)
  0 siblings, 3 replies; 80+ messages in thread
From: Thomas Gleixner @ 2014-05-15 14:14 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Carlos O'Donell, Darren Hart, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API

On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
> And that universe would love to have your documentation of
> FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),

I give you almost the full treatment, but I leave REQUEUE_PI to Darren
and FUTEX_WAKE_OP to Jakub. :)


FUTEX_WAIT

	< Existing blurb seems ok >

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The supplied timeout argument is not normalized.

	[EWOULDBLOCK] The atomic enqueueing failed. User space value
		      at uaddr is not equal val argument.

	[ETIMEDOUT] timeout expired 


FUTEX_WAKE

	< Existing blurb seems ok >

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI

FUTEX_REQUEUE

	Existing blurb seems ok , except for this:

	The argument val contains the number of waiters on uaddr which
	are immediately woken up.

	The timeout argument is abused to transport the number of
	waiters which are requeued to the futex at uaddr2. The pointer
	is typecasted to u32.


	[EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2

	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
		 valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr

	[EINVAL] uaddr equal uaddr2. Requeue to same futex.

FUTEX_REQUEUE_CMP

	Existing blurb seems ok , except for this:

	The argument val is contains the number of waiters on uaddr
	which are immediately woken up.

	The timeout argument is abused to transport the number of
	waiters which are requeued to the futex at uaddr2. The pointer
	is typecasted to u32.

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2

	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
		 valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] uaddr equal uaddr2. Requeue to same futex.

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr

	[EAGAIN] uaddr1 readout is not equal the compare value in
		 argument val3

FUTEX_WAKE_OP


Jakub, can you please explain it? I'm lost :)


	The argument val contains the maximum number of waiters on
	uaddr which are immediately woken up.

	The timeout argument is abused to transport the maximum
	number of waiters on uaddr2 which are woken up. The pointer
	is typecasted to u32.

	Related return values

	[EFAULT] Kernel was unable to access the futex values at uaddr
		 or uaddr2

	[EINVAL] The supplied uaddr or uaddr2 argument does not point
		 to a valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr


FUTEX_WAIT_BITSET

	The same as FUTEX_WAIT except that val3 is used to provide a
	32bit bitset to the kernel. This bitset is stored in the
	kernel internal state of the waiter.

	This futex op also allows to have the option bit
	FUTEX_CLOCK_REALTIME set.

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

  	[EINVAL] The supplied bitset is zero.

	[EINVAL] The supplied timeout argument is not normalized.

	[ETIMEDOUT] timeout expired 


FUTEX_WAKE_BITSET

	The same as FUTEX_WAKE except that val3 is used to provide a
	32bit bitset to the kernel. This bitset is used to select
	waiters on the futex. The selection is done by a bitwise AND
	of the wake side supplied bitset and the bitset which is
	stored in the kernel internal state of the waiters. If the
	result is non zero, the waiter is woken, otherwise left
	waiting.

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

  	[EINVAL] The supplied bitset is zero.

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI

FUTEX_LOCK_PI

	This operation reads from the futex address provided by the
	uaddr argument, which contains the namespace specific TID of
	the lock owner. If the TID is 0, then the kernel tries to set
	the waiters TID atomically. If the TID is nonzero or the take
	over fails the kernel sets atomically the FUTEX_WAITERS bit
	which signals the owner, that it cannot unlock the futex in
	user space atomically by transitioning from TID to 0. After
	that the kernel tries to find the task which is associated to
	the owner TID, creates or reuses kernel state on behalf of the
	owner and attaches the waiter to it. The enqueing of the
	waiter is in descending priority order if more than one waiter
	exists. The owner inherits either the priority or the
	bandwidth of the waiter. This inheritance follows the lock
	chain in the case of nested locking and performs deadlock
	detection.

	The timeout argument is handled as described in FUTEX_WAIT.
	The arguments uaddr2, val, and val3 are ignored.

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[ENOMEM] Kernel could not allocate state

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The supplied timeout argument is not normalized.
		 
	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state. Thats
		 either state corruption or it found a waiter on uaddr
		 which is waiting on FUTEX_WAIT[_BITSET]

	[EPERM]  Caller is not allowed to attach itself to the futex.
		 Can be a legitimate issue or a hint for state
		 corruption in user space

	[ESRCH]	 The TID in the user space value does not exist

	[EAGAIN] The futex owner TID is about to exit, but has not yet
		 handled the internal state cleanup. Try again.	 

	[ETIMEDOUT] timeout expired 

	[EDEADLOCK] The futex is already locked by the caller or the kernel
		    detected a deadlock scenario in a nested lock chain

	[EOWNERDIED] The owner of the futex died and the kernel made the
		     caller the new owner. The kernel sets the
		     FUTEX_OWNER_DIED bit in the futex userspace value.
		     Caller is responsible for cleanup

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants  (runtime detection)
		     
FUTEX_TRYLOCK_PI

	This operation tries to acquire the futex at uaddr. It deals
	with the situation where the TID value at uaddr is 0, but the
	FUTEX_HAS_WAITER bit is set. User space cannot handle this
	race free.

	The arguments uaddr2, val, timeout and val3 are ignored.

	Return values:

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[ENOMEM] Kernel could not allocate state

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the user
		 space state at uaddr and the kernel state

	[EPERM]  Caller is not allowed to attach itself to the futex.
		 Can be a legitimate issue or a hint for state
		 corruption in user space

	[ESRCH]	 The TID in the user space value does not exist

	[EAGAIN] The futex owner TID is about to exit, but has not yet
		 handled the internal state cleanup. Try again.	 

	[EDEADLOCK] The futex is already locked by the caller.

	[EOWNERDIED] The owner of the futex died and the kernel made the
		     caller the new owner. The kernel sets the
		     FUTEX_OWNER_DIED bit in the futex userspace value.
		     Caller is responsible for cleanup

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)

FUTEX_UNLOCK_PI

	This operation wakes the top priority waiter which is waiting
	in FUTEX_LOCK_PI on the futex address provided by the uaddr
	argument.

	This is called when the user space value at uaddr cannot be
	changed atomically from TID (of the owner) to 0.

	The arguments uaddr2, val, timeout and val3 are ignored.

	Related return values:
	
	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_WAIT[_BITSET].

	[EPERM]  Caller does not own the futex.

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)

FUTEX_WAIT_REQUEUE_PI

	Wait operation to wait on a non pi futex at uaddr and
	potentially be requeued on a pi futex at uaddr2. The wait
	operation on uaddr is the same as FUTEX_WAIT. The waiter can
	be removed from the wait on uaddr via FUTEX_WAKE without
	requeuing on uaddr2.

	The timeout argument is handled as described in FUTEX_WAIT.

Darren, can you fill in the missing details?

	Return values:

	[EFAULT] Kernel was unable to access the futex value at uaddr
		 or uaddr2

	[EINVAL] The supplied uaddr or uaddr2 argument does not point
		 to a valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] The supplied timeout argument is not normalized.

  	[EINVAL] The supplied bitset is zero.

	[EWOULDBLOCK] The atomic enqueueing failed. User space value
		      at uaddr is not equal val argument.

	[ETIMEDOUT] timeout expired 

	[EOWNERDIED] The owner of the PI futex at uaddr2 died and the
		     kernel made the caller the new owner. The kernel
		     sets the FUTEX_OWNER_DIED bit in the uaddr2 futex
		     userspace value.  Caller is responsible for
		     cleanup

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)


FUTEX_CMP_REQUEUE_PI

	PI aware variant of FUTEX_CMP_REQUEUE. Inner futex at uaddr is
	a non PI futex. Outer futex to which is requeued is a PI futex
	at uaddr2.

	The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.

	The argument val is contains the number of waiters on uaddr
	which are immediately woken up. Must be 1 for this opcode.

	The timeout argument is abused to transport the number of
	waiters which are requeued on to the futex at uaddr2. The
	pointer is typecasted to u32.

Darren, can you fill in the missing details?

	[EFAULT] Kernel was unable to access the futex value at uaddr
		 or uaddr2

	[ENOMEM] Kernel could not allocate state

	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
		 valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] uaddr equal uaddr2. Requeue to same futex.

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_WAIT[_BITSET] on uaddr

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr2 and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_WAIT on uaddr2.

  	[EINVAL] The supplied bitset is zero.

	[EAGAIN] uaddr1 readout is not equal the compare value in
		 argument val3

	[EAGAIN] The futex owner TID of uaddr2 is about to exit, but
		 has not yet handled the internal state cleanup. Try
		 again.

	[EPERM]  Caller is not allowed to attach the waiter to the
		 futex at uaddr2 Can be a legitimate issue or a hint
		 for state corruption in user space

	[ESRCH]	 The TID in the user space value at uaddr2 does not exist

	[EDEADLOCK] The requeuing of a waiter to the kernel representation
		    of the PI futex at uaddr2 detected a deadlock scenario.

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)


The various option bits seem to be undocumented as well

FUTEX_PRIVATE_FLAG

	This option bit can be ored on all futex ops.

	It tells the kernel, that the futex is process private and not
	shared with another process. That allows the kernel to chose
	the fast path for validating the user space address and avoids
	expensive VMA lookup, taking refcounts on file backing store
	etc.

FUTEX_CLOCK_REALTIME

	This option bit can be ored on the futex ops FUTEX_WAIT_BITSET
	and FUTEX_WAIT_REQUEUE_PI

	If set the kernel treats the user space supplied timeout as
	absolute time based on CLOCK_REALTIME.

	If not set the kernel treats the user space supplied timeout
	as relative time.

	If this is set on any other op than the supported ones, kernel
	returns ENOSYS!


Thanks,

	tglx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 13:49             ` Michael Kerrisk (man-pages)
  2014-05-15 13:55               ` Peter Zijlstra
@ 2014-05-15 14:39               ` Carlos O'Donell
  2014-05-15 15:11                 ` Peter Zijlstra
  1 sibling, 1 reply; 80+ messages in thread
From: Carlos O'Donell @ 2014-05-15 14:39 UTC (permalink / raw)
  To: mtk.manpages, Peter Zijlstra
  Cc: Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Linux API

On 05/15/2014 09:49 AM, Michael Kerrisk (man-pages) wrote:
> On Thu, May 15, 2014 at 3:22 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Thu, May 15, 2014 at 09:18:22AM -0400, Carlos O'Donell wrote:
>>> On 05/15/2014 04:14 AM, Peter Zijlstra wrote:
>>>> On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
>>>>> There are other syscalls like gettid() that have a:
>>>>> NOTE: There is no glibc wrapper for this system call; see NOTES.
>>>>
>>>> Yes, can we finally fix that please? It gets tedious having to endlessly
>>>> copy/paste that thing around.
>>>
>>> What exactly would you like fixed?
>>
>> Not having gettid() in glibc.
> 
> Get in the line ;-).
> http://sourceware.org/bugzilla/show_bug.cgi?id=6399

I have no objections to this, but I absolutely object to this without
someone documenting and gathering consensus for consistent terminology
to be used between the kernel and glibc.

The relevant comment is here:
https://sourceware.org/bugzilla/show_bug.cgi?id=6399#c26

I'd like to see a glibc manual patch for the threads.texi file, which
can be completely linux-specific, to document gettid() and nomenclature.
It should talk about the nomenclature used to discuss these interfaces
and explain when it is or isn't valid to use a task id and with what 
functions.

For example does gettid *really* return a pid_t as considered by
userspace? It's not a full out process...

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 13:46       ` Michael Kerrisk (man-pages)
@ 2014-05-15 14:59         ` H. Peter Anvin
  2014-05-15 15:42         ` chrubis
  2014-05-15 15:47         ` Darren Hart
  2 siblings, 0 replies; 80+ messages in thread
From: H. Peter Anvin @ 2014-05-15 14:59 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages),
	Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek
  Cc: linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 05/15/2014 06:46 AM, Michael Kerrisk (man-pages) wrote:
> 
> People have a number of times noted that there are problems
> with syscall(), but I'm not knowledgeable on the details.
> I'd happily take a patch to the man page (which, for historical
> reasons, is actually syscall(2)) that explains the the problems 
> (and ideally notes those platforms where there are no problems).
> 

It has to do with how ABIs deal with doublewidth arguments.

There is a reason why Linux syscall ABIs generally have a 1:1 mapping
with the user space ABIs, and why the system call argument is passed not
in the first argument but in a different place (usually a separately
clobbered register, e.g. %eax on x86-64).

On some platforms, doublewidth registers have to be aligned in register
pairs.  On some other platforms, enough arguments mean some will be
passed in memory, where they are forced to be aligned, or they are not
allowed to straddle the register-memory boundary.  All of this means
that padding words might be introduced, and they will be introduced in
the wrong place because of the additional argument introduced at the
beginning of the argument sequence.

On the other hand, the old SYSCALL user-space macros just plain didn't
handle doubleword arguments.

	-hpa


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 14:39               ` Carlos O'Donell
@ 2014-05-15 15:11                 ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2014-05-15 15:11 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: mtk.manpages, Darren Hart, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Linux API

[-- Attachment #1: Type: text/plain, Size: 448 bytes --]

On Thu, May 15, 2014 at 10:39:09AM -0400, Carlos O'Donell wrote:
> For example does gettid *really* return a pid_t as considered by
> userspace? It's not a full out process...

Yeah, PIDs and TIDs are the same namespace in the kernel. All we have
are tasks and each task has an id. gettid() actually returns the id of
the current task.

getpid() returns the id of the thread group leader, so for that task
gettid() and getpid() return the same id.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 21:05   ` Davidlohr Bueso
@ 2014-05-15 15:15     ` Joseph S. Myers
  0 siblings, 0 replies; 80+ messages in thread
From: Joseph S. Myers @ 2014-05-15 15:15 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Darren Hart, Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On Wed, 14 May 2014, Davidlohr Bueso wrote:

> > If I'm wrong, or we can restore the futex() call, great. If not... Should
> > we keep the man-pages and document it as syscall(SYS_futex, ..., op, ...) ?
> 
> +1, is there anything preventing adding a futex wrapper... glibc folks?

See what I said at 
<https://sourceware.org/bugzilla/show_bug.cgi?id=9712#c4> (with references 
to previous discussions).  Someone needs to take the lead on pushing to 
consensus the question of what syscalls should have wrappers in glibc, and 
then implement the conclusions.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-14 16:18 ` Darren Hart
                     ` (2 preceding siblings ...)
  2014-05-15  0:18   ` H. Peter Anvin
@ 2014-05-15 15:28   ` chrubis
  2014-05-15 15:40     ` Steven Rostedt
  2014-05-15 16:14     ` Darren Hart
  3 siblings, 2 replies; 80+ messages in thread
From: chrubis @ 2014-05-15 15:28 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi!
> 
> However, unless I'm sorely mistaken, the larger problem is that glibc
> removed the futex() call entirely, so these man pages don't describe
> something users even have access to anymore. I had to revert to calling
> the syscalls directly in the futextest test suite because of this:
> 
> http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu

So there actually exists some tests for futexes, I've been asked if we
have these as a LTP[1] maintainer several times.

Are these tests executed regulary as a part of some automated framework?
If not it would make sense to port them to LTP (looking at the code that
would be quite easy task) and get them executed by several QA
departments for free. What do you think?

[1] http://linux-test-project.github.io/

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  0:18   ` H. Peter Anvin
  2014-05-15  5:21     ` Darren Hart
@ 2014-05-15 15:35     ` chrubis
  1 sibling, 0 replies; 80+ messages in thread
From: chrubis @ 2014-05-15 15:35 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Darren Hart, Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi!
> > However, unless I'm sorely mistaken, the larger problem is that glibc
> > removed the futex() call entirely, so these man pages don't describe
> > something users even have access to anymore. I had to revert to calling
> > the syscalls directly in the futextest test suite because of this:
> > 
> > http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu
> > de/futextest.h#n67
> > 
> 
> This really comes down to the fact that we should have a libinux which
> contains the basic system call wrapper machinery for Linux specific
> things and nothing else.
> 
> syscall(3) is toxic and breaks randomly on some platforms.

+1

And while cleaning the LTP[1] testcases, we are slowly extracting the
special cases into commont code.

[1] http://linux-test-project.github.io/

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 15:28   ` chrubis
@ 2014-05-15 15:40     ` Steven Rostedt
  2014-05-15 16:14     ` Darren Hart
  1 sibling, 0 replies; 80+ messages in thread
From: Steven Rostedt @ 2014-05-15 15:40 UTC (permalink / raw)
  To: chrubis
  Cc: Darren Hart, Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Peter Zijlstra, Linux API,
	Carlos O'Donell

On Thu, 15 May 2014 17:28:35 +0200
chrubis@suse.cz wrote:

> Hi!
> > 
> > However, unless I'm sorely mistaken, the larger problem is that glibc
> > removed the futex() call entirely, so these man pages don't describe
> > something users even have access to anymore. I had to revert to calling
> > the syscalls directly in the futextest test suite because of this:
> > 
> > http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inclu
> 
> So there actually exists some tests for futexes, I've been asked if we
> have these as a LTP[1] maintainer several times.
> 
> Are these tests executed regulary as a part of some automated framework?
> If not it would make sense to port them to LTP (looking at the code that
> would be quite easy task) and get them executed by several QA
> departments for free. What do you think?
> 
> [1] http://linux-test-project.github.io/
> 

I think Thomas may be working on one. If not, I'd be happy to start
writing one as well.

-- Steve

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 13:46       ` Michael Kerrisk (man-pages)
  2014-05-15 14:59         ` H. Peter Anvin
@ 2014-05-15 15:42         ` chrubis
  2014-05-15 15:52           ` H. Peter Anvin
  2014-05-15 15:47         ` Darren Hart
  2 siblings, 1 reply; 80+ messages in thread
From: chrubis @ 2014-05-15 15:42 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Darren Hart, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API, Carlos O'Donell

Hi!
> People have a number of times noted that there are problems
> with syscall(), but I'm not knowledgeable on the details.
> I'd happily take a patch to the man page (which, for historical
> reasons, is actually syscall(2)) that explains the the problems 
> (and ideally notes those platforms where there are no problems).

Have a look at this commit that tries to deal with passing 64 bit
numbers to syscalls. On 32 bit ABI (but not on X32) these needs to be
split up (accordingly to machine endianity).

https://github.com/linux-test-project/ltp/commit/04afb02b4280a20c262054e8f99a3fad4ad54916

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  8:13       ` Peter Zijlstra
@ 2014-05-15 15:43         ` Darren Hart
  0 siblings, 0 replies; 80+ messages in thread
From: Darren Hart @ 2014-05-15 15:43 UTC (permalink / raw)
  To: Peter Zijlstra, Carlos O'Donell
  Cc: mtk.manpages, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Linux API

On 5/15/14, 1:13, "Peter Zijlstra" <peterz@infradead.org> wrote:

>On Wed, May 14, 2014 at 04:23:38PM -0400, Carlos O'Donell wrote:
>> On 05/14/2014 03:03 PM, Michael Kerrisk (man-pages) wrote:
>> >> However, unless I'm sorely mistaken, the larger problem is that glibc
>> >> removed the futex() call entirely, so these man pages don't describe
>> > 
>> > I don't think futex() ever was in glibc--that's by design, and
>> > completely understandable: no user-space application would want to
>> > directly use futex(). (BTW, I mispoke in my earlier mail when I said I
>> > wanted documentation suitable for "writers of library functions" -- I
>> > meant suitable for "writers of *C library*".)
>> 
>> I fully agree with Michael here.
>> 
>> The futex() syscall was never exposed to userspace specifically because
>> it was an interface we did not want to support forever with a stable
>>ABI.
>> The futex() syscall is an implementation detail that is shared between
>> the kernel and the writers of core runtimes for Linux.
>
>That ship has sailed.. for one we must always support old glibc which
>uses the futex() syscall, and secondly there are known other programs
>that actually use the futex syscall.
>
>So that's really a non-argument, we're hard tied to the ABI.

Indeed. This is specifically why FUTEX_REQUEUE still exists (despite it's
bugs) when only FUTEX_CMP_REQUEUE should ever be used in new programs.


-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 13:46       ` Michael Kerrisk (man-pages)
  2014-05-15 14:59         ` H. Peter Anvin
  2014-05-15 15:42         ` chrubis
@ 2014-05-15 15:47         ` Darren Hart
  2 siblings, 0 replies; 80+ messages in thread
From: Darren Hart @ 2014-05-15 15:47 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages),
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Jakub Jelinek
  Cc: linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 5/15/14, 6:46, "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
wrote:

>On 05/15/2014 07:21 AM, Darren Hart wrote:
>> On 5/14/14, 17:18, "H. Peter Anvin" <hpa@zytor.com> wrote:
>> 
>>> On 05/14/2014 09:18 AM, Darren Hart wrote:
>>>>
>>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>>> removed the futex() call entirely, so these man pages don't describe
>>>> something users even have access to anymore. I had to revert to
>>>>calling
>>>> the syscalls directly in the futextest test suite because of this:
>>>>
>>>>
>>>> 
>>>>http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/i
>>>>nc
>>>> lu
>>>> de/futextest.h#n67
>>>>
>>>
>>> This really comes down to the fact that we should have a libinux which
>>> contains the basic system call wrapper machinery for Linux specific
>>> things and nothing else.
>>>
>>> syscall(3) is toxic and breaks randomly on some platforms.
>> 
>> Peter Z and I have had a good time discussing this in the past.... And
>> here it is again. :-)
>
>People have a number of times noted that there are problems
>with syscall(), but I'm not knowledgeable on the details.
>I'd happily take a patch to the man page (which, for historical
>reasons, is actually syscall(2)) that explains the the problems
>(and ideally notes those platforms where there are no problems).


>From my perspective, a named interface with specific documented interfaces
is far more usable than a vargs direct syscall. That just leaves all kinds
of room for error - which of course is why we all write our own wrappers
in our apps rather than use it directly... If we all do it, it seems to me
that is a strong indicator we should provide it in some kind of common
library. Maybe that's libc... Maybe that's libnix...
-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 15:42         ` chrubis
@ 2014-05-15 15:52           ` H. Peter Anvin
  2014-05-15 16:01             ` chrubis
  0 siblings, 1 reply; 80+ messages in thread
From: H. Peter Anvin @ 2014-05-15 15:52 UTC (permalink / raw)
  To: chrubis, Michael Kerrisk (man-pages)
  Cc: Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 05/15/2014 08:42 AM, chrubis@suse.cz wrote:
> Hi!
>> People have a number of times noted that there are problems
>> with syscall(), but I'm not knowledgeable on the details.
>> I'd happily take a patch to the man page (which, for historical
>> reasons, is actually syscall(2)) that explains the the problems 
>> (and ideally notes those platforms where there are no problems).
> 
> Have a look at this commit that tries to deal with passing 64 bit
> numbers to syscalls. On 32 bit ABI (but not on X32) these needs to be
> split up (accordingly to machine endianity).
> 
> https://github.com/linux-test-project/ltp/commit/04afb02b4280a20c262054e8f99a3fad4ad54916
> 

That is wrong, too.  That assumes that there will never be padding
words, which isn't true in the general case, either.

I really believe the proper fix is to use assembly syscall stubs.  In
klibc I build a fairly elaborate machinery to autogenerate such syscall
stubs for a variety of architectures.

	-hpa


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 15:52           ` H. Peter Anvin
@ 2014-05-15 16:01             ` chrubis
  2014-05-15 16:07               ` H. Peter Anvin
  0 siblings, 1 reply; 80+ messages in thread
From: chrubis @ 2014-05-15 16:01 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Michael Kerrisk (man-pages),
	Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

Hi!
> > Have a look at this commit that tries to deal with passing 64 bit
> > numbers to syscalls. On 32 bit ABI (but not on X32) these needs to be
> > split up (accordingly to machine endianity).
> > 
> > https://github.com/linux-test-project/ltp/commit/04afb02b4280a20c262054e8f99a3fad4ad54916
> > 
> 
> That is wrong, too.  That assumes that there will never be padding
> words, which isn't true in the general case, either.

Well, it's still far better than the mess we had previously and it works
in most of the cases. However I would love to fix these correctly once
for all.

> I really believe the proper fix is to use assembly syscall stubs.  In
> klibc I build a fairly elaborate machinery to autogenerate such syscall
> stubs for a variety of architectures.

Then it would be nice to share these between klibc and LTP (and possible
everybody else).

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 16:01             ` chrubis
@ 2014-05-15 16:07               ` H. Peter Anvin
  2014-05-15 16:17                 ` chrubis
  0 siblings, 1 reply; 80+ messages in thread
From: H. Peter Anvin @ 2014-05-15 16:07 UTC (permalink / raw)
  To: chrubis
  Cc: Michael Kerrisk (man-pages),
	Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 05/15/2014 09:01 AM, chrubis@suse.cz wrote:
> 
>> I really believe the proper fix is to use assembly syscall stubs.  In
>> klibc I build a fairly elaborate machinery to autogenerate such syscall
>> stubs for a variety of architectures.
> 
> Then it would be nice to share these between klibc and LTP (and possible
> everybody else).
> 

It should be quite easy to extract from klibc.

	-hpa


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 15:28   ` chrubis
  2014-05-15 15:40     ` Steven Rostedt
@ 2014-05-15 16:14     ` Darren Hart
  2014-05-15 16:30       ` chrubis
  1 sibling, 1 reply; 80+ messages in thread
From: Darren Hart @ 2014-05-15 16:14 UTC (permalink / raw)
  To: chrubis
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On 5/15/14, 8:28, "chrubis@suse.cz" <chrubis@suse.cz> wrote:

>Hi!
>> 
>> However, unless I'm sorely mistaken, the larger problem is that glibc
>> removed the futex() call entirely, so these man pages don't describe
>> something users even have access to anymore. I had to revert to calling
>> the syscalls directly in the futextest test suite because of this:
>> 
>> 
>>http://git.kernel.org/cgit/linux/kernel/git/dvhart/futextest.git/tree/inc
>>lu
>
>So there actually exists some tests for futexes, I've been asked if we
>have these as a LTP[1] maintainer several times.
>
>Are these tests executed regulary as a part of some automated framework?
>If not it would make sense to port them to LTP (looking at the code that
>would be quite easy task) and get them executed by several QA
>departments for free. What do you think?
>
>[1] http://linux-test-project.github.io/

I've used LTP in the past (quite a bit), and I felt there was some
advantage to keeping futextest independent. Perhaps things have changed
enough since then (~2009 era) that we should reconsider. We can discuss
the pros/cons there if you like. I have agreed to move the performance
related tests over to perf, and Davidlohr has added some other such tests
to perf. Trinity now covers the planned fuzz testing for futexes (very
well... Obviously) so that idea will be dropped, leaving pure functional
tests in futextest.

-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 16:07               ` H. Peter Anvin
@ 2014-05-15 16:17                 ` chrubis
  2014-05-15 16:56                   ` H. Peter Anvin
  0 siblings, 1 reply; 80+ messages in thread
From: chrubis @ 2014-05-15 16:17 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Michael Kerrisk (man-pages),
	Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

Hi!
> >> I really believe the proper fix is to use assembly syscall stubs.  In
> >> klibc I build a fairly elaborate machinery to autogenerate such syscall
> >> stubs for a variety of architectures.
> > 
> > Then it would be nice to share these between klibc and LTP (and possible
> > everybody else).
> > 
> 
> It should be quite easy to extract from klibc.

That is not the main concern here. If I extract the code I would have to
watch for any changes manually. If it was in a library or a separate
repository all that would be needed is to add it as dependency/git
submodule and I would get all updates automatically.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 16:14     ` Darren Hart
@ 2014-05-15 16:30       ` chrubis
  2014-05-15 18:17         ` Darren Hart
  0 siblings, 1 reply; 80+ messages in thread
From: chrubis @ 2014-05-15 16:30 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi!
> I've used LTP in the past (quite a bit), and I felt there was some
> advantage to keeping futextest independent.

What advantages did you have in mind?

> Perhaps things have changed enough since then (~2009 era) that we
> should reconsider.

I've been working on LTP for a about three years now and we happen to do
quite a lot in that time. The most visible changes would be more proper
development practices (git, proper build system, code review, LKML
coding style, documentation, ...) and also huge number of fixes. Now we
are trying to catch up in coverage too.

> We can discuss the pros/cons there if you like.

I would love to :).

> I have agreed to move the performance related tests over to perf, and
> Davidlohr has added some other such tests to perf. Trinity now covers
> the planned fuzz testing for futexes (very well... Obviously) so that
> idea will be dropped, leaving pure functional tests in futextest.

Well LTP mostly consists of functional tests, so that would fit the
purpose very well.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 16:17                 ` chrubis
@ 2014-05-15 16:56                   ` H. Peter Anvin
  2014-05-15 17:06                     ` chrubis
  0 siblings, 1 reply; 80+ messages in thread
From: H. Peter Anvin @ 2014-05-15 16:56 UTC (permalink / raw)
  To: chrubis
  Cc: Michael Kerrisk (man-pages),
	Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

On 05/15/2014 09:17 AM, chrubis@suse.cz wrote:
>>
>> It should be quite easy to extract from klibc.
> 
> That is not the main concern here. If I extract the code I would have to
> watch for any changes manually. If it was in a library or a separate
> repository all that would be needed is to add it as dependency/git
> submodule and I would get all updates automatically.
> 

Yes, and for that to happen someone needs to do the work to extract it.
 I don't have the cycles myself at the moment.

	-hpa


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 16:56                   ` H. Peter Anvin
@ 2014-05-15 17:06                     ` chrubis
  0 siblings, 0 replies; 80+ messages in thread
From: chrubis @ 2014-05-15 17:06 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Michael Kerrisk (man-pages),
	Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Carlos O'Donell

Hi!
> > That is not the main concern here. If I extract the code I would have to
> > watch for any changes manually. If it was in a library or a separate
> > repository all that would be needed is to add it as dependency/git
> > submodule and I would get all updates automatically.
> > 
> 
> Yes, and for that to happen someone needs to do the work to extract it.
>  I don't have the cycles myself at the moment.

If that is the only problem, I should be able to allocate some time in
order to have a look at it.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 16:30       ` chrubis
@ 2014-05-15 18:17         ` Darren Hart
  2014-05-15 19:05           ` chrubis
  0 siblings, 1 reply; 80+ messages in thread
From: Darren Hart @ 2014-05-15 18:17 UTC (permalink / raw)
  To: chrubis
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On 5/15/14, 9:30, "chrubis@suse.cz" <chrubis@suse.cz> wrote:

>Hi!
>> I've used LTP in the past (quite a bit), and I felt there was some
>> advantage to keeping futextest independent.
>
>What advantages did you have in mind?

Not CVS was a big one at the time ;-)

OK, I don't mean to be disparaging here... But since you asked, back in
'09 LTP had some test quality issues and I felt I could maintain futextest
to a higher bar independently.

>
>> Perhaps things have changed enough since then (~2009 era) that we
>> should reconsider.
>
>I've been working on LTP for a about three years now and we happen to do
>quite a lot in that time. The most visible changes would be more proper
>development practices (git, proper build system, code review, LKML
>coding style, documentation, ...) and also huge number of fixes. Now we
>are trying to catch up in coverage too.
>
>> We can discuss the pros/cons there if you like.
>
>I would love to :).

Does LTP need to own the code, or can it incorporate existing projects and
a sort of aggregator?

How much LTP harness type code needs to be used?

-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 18:17         ` Darren Hart
@ 2014-05-15 19:05           ` chrubis
  2014-05-15 19:38             ` Darren Hart
  0 siblings, 1 reply; 80+ messages in thread
From: chrubis @ 2014-05-15 19:05 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi!
> >> I've used LTP in the past (quite a bit), and I felt there was some
> >> advantage to keeping futextest independent.
> >
> >What advantages did you have in mind?
> 
> Not CVS was a big one at the time ;-)
> 
> OK, I don't mean to be disparaging here... But since you asked, back in
> '09 LTP had some test quality issues and I felt I could maintain futextest
> to a higher bar independently.

To be honest LTP was one of the messiest codebases I've seen and it was
hacked up by mostly clueless people (there were even tests with race
conditions that were carefully disabled in a way that was not easy to
see). It took me months to get to a state where it compiled fine on
major distributions.

Today we still have quite a bit of legacy code that needs to be cleaned
up, however that gets better every day.

And most of the testcases are pretty stable, etc. unfortunatelly LTP has
a bad reputation which is lot harder to fix than the code itself.

> >> Perhaps things have changed enough since then (~2009 era) that we
> >> should reconsider.
> >
> >I've been working on LTP for a about three years now and we happen to do
> >quite a lot in that time. The most visible changes would be more proper
> >development practices (git, proper build system, code review, LKML
> >coding style, documentation, ...) and also huge number of fixes. Now we
> >are trying to catch up in coverage too.
> >
> >> We can discuss the pros/cons there if you like.
> >
> >I would love to :).
> 
> Does LTP need to own the code, or can it incorporate existing projects and
> a sort of aggregator?

That is possible as well but not optimal. This approach would need a
wrapper script to convert the test exit values to be LTP compatible.

> How much LTP harness type code needs to be used?

Not much.

For this complexity of tests you would just need to call the tst_resm()
interface to report success/failure and, at the end of the test,
tst_exit() to return the stored overall test status.

And ideally call the standard option parsing code and call the test in
standard loop so that the test can take advantage of standard options as
number of iterations to run, etc.

Have a look at:

https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines

there is simple test example as well as description of the interfaces.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15  0:28       ` H. Peter Anvin
  2014-05-15  0:35         ` Andy Lutomirski
@ 2014-05-15 19:10         ` Carlos O'Donell
  1 sibling, 0 replies; 80+ messages in thread
From: Carlos O'Donell @ 2014-05-15 19:10 UTC (permalink / raw)
  To: H. Peter Anvin, Davidlohr Bueso, mtk.manpages
  Cc: Darren Hart, Thomas Gleixner, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API

On 05/14/2014 08:28 PM, H. Peter Anvin wrote:
> On 05/14/2014 01:56 PM, Davidlohr Bueso wrote:
>>>
>>>> However, unless I'm sorely mistaken, the larger problem is that glibc
>>>> removed the futex() call entirely, so these man pages don't describe
>>>
>>> I don't think futex() ever was in glibc--that's by design, and
>>> completely understandable: no user-space application would want to
>>> directly use futex(). 
>>
>> That's actually not quite true. There are plenty of software efforts out
>> there that use futex calls directly to implement userspace serialization
>> mechanisms as an alternative to the bulky sysv semaphores. I worked
>> closely with an in-memory DB project that makes heavy use of them. Not
>> everyone can simply rely on pthreads.
>>
> 
> More fundamentally, futex(2), like clone(2), are things that can be
> legitimately by user space without automatically breaking all of glibc.
>  There are some other things where that is *not* true, because glibc
> relies on being able to mediate all accesses to a kernel facility, but
> not here.

Careful there. There is *some* danger in using clone(2) because of the
coordination required to implement thread-local storage. I'm sure you're
aware of this, but I'd like the record to show that we're going to need
clear documentation of what's considered safe given the known
implementations.

Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 19:05           ` chrubis
@ 2014-05-15 19:38             ` Darren Hart
  2014-08-11 10:19               ` chrubis
                                 ` (2 more replies)
  0 siblings, 3 replies; 80+ messages in thread
From: Darren Hart @ 2014-05-15 19:38 UTC (permalink / raw)
  To: chrubis
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

On 5/15/14, 12:05, "chrubis@suse.cz" <chrubis@suse.cz> wrote:

>Hi!
>> >> I've used LTP in the past (quite a bit), and I felt there was some
>> >> advantage to keeping futextest independent.
>> >
>> >What advantages did you have in mind?
>> 
>> Not CVS was a big one at the time ;-)
>> 
>> OK, I don't mean to be disparaging here... But since you asked, back in
>> '09 LTP had some test quality issues and I felt I could maintain
>>futextest
>> to a higher bar independently.
>
>To be honest LTP was one of the messiest codebases I've seen and it was
>hacked up by mostly clueless people (there were even tests with race
>conditions that were carefully disabled in a way that was not easy to
>see). It took me months to get to a state where it compiled fine on
>major distributions.
>
>Today we still have quite a bit of legacy code that needs to be cleaned
>up, however that gets better every day.
>
>And most of the testcases are pretty stable, etc. unfortunatelly LTP has
>a bad reputation which is lot harder to fix than the code itself.
>
>> >> Perhaps things have changed enough since then (~2009 era) that we
>> >> should reconsider.
>> >
>> >I've been working on LTP for a about three years now and we happen to
>>do
>> >quite a lot in that time. The most visible changes would be more proper
>> >development practices (git, proper build system, code review, LKML
>> >coding style, documentation, ...) and also huge number of fixes. Now we
>> >are trying to catch up in coverage too.
>> >
>> >> We can discuss the pros/cons there if you like.
>> >
>> >I would love to :).
>> 
>> Does LTP need to own the code, or can it incorporate existing projects
>>and
>> a sort of aggregator?
>
>That is possible as well but not optimal. This approach would need a
>wrapper script to convert the test exit values to be LTP compatible.
>
>> How much LTP harness type code needs to be used?
>
>Not much.
>
>For this complexity of tests you would just need to call the tst_resm()
>interface to report success/failure and, at the end of the test,
>tst_exit() to return the stored overall test status.
>
>And ideally call the standard option parsing code and call the test in
>standard loop so that the test can take advantage of standard options as
>number of iterations to run, etc.
>
>Have a look at:
>
>https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines
>
>there is simple test example as well as description of the interfaces.


Thanks Cyril,

I'll follow up with you in a couple weeks most likely. I have some urgent
things that will be taking all my time and then some until then. Feel free
to poke me though if I lose track of it :-)

-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 14:14           ` Thomas Gleixner
@ 2014-05-15 20:19             ` Michael Kerrisk (man-pages)
  2014-08-04 14:46               ` Carlos O'Donell
  2014-05-15 20:35             ` Darren Hart
  2015-01-15 15:10             ` Michael Kerrisk (man-pages)
  2 siblings, 1 reply; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-05-15 20:19 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Darren Hart, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API

On 05/15/2014 04:14 PM, Thomas Gleixner wrote:
> On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
>> And that universe would love to have your documentation of
>> FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),
> 
> I give you almost the full treatment, but I leave REQUEUE_PI to Darren
> and FUTEX_WAKE_OP to Jakub. :)

Thanks Thomas--that's fantastic! Hopefully, Darren and Jakub fill in those
missing pieces...

Cheers,

Michael


> FUTEX_WAIT
> 
> 	< Existing blurb seems ok >
> 
> 	Related return values
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr.
> 
> 	[EINVAL] The supplied uaddr argument does not point to a valid
> 		 object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] The supplied timeout argument is not normalized.
> 
> 	[EWOULDBLOCK] The atomic enqueueing failed. User space value
> 		      at uaddr is not equal val argument.
> 
> 	[ETIMEDOUT] timeout expired 
> 
> 
> FUTEX_WAKE
> 
> 	< Existing blurb seems ok >
> 
> 	Related return values
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr.
> 
> 	[EINVAL] The supplied uaddr argument does not point to a valid
> 		 object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_LOCK_PI
> 
> FUTEX_REQUEUE
> 
> 	Existing blurb seems ok , except for this:
> 
> 	The argument val contains the number of waiters on uaddr which
> 	are immediately woken up.
> 
> 	The timeout argument is abused to transport the number of
> 	waiters which are requeued to the futex at uaddr2. The pointer
> 	is typecasted to u32.
> 
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2
> 
> 	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
> 		 valid object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_LOCK_PI on uaddr
> 
> 	[EINVAL] uaddr equal uaddr2. Requeue to same futex.
> 
> FUTEX_REQUEUE_CMP
> 
> 	Existing blurb seems ok , except for this:
> 
> 	The argument val is contains the number of waiters on uaddr
> 	which are immediately woken up.
> 
> 	The timeout argument is abused to transport the number of
> 	waiters which are requeued to the futex at uaddr2. The pointer
> 	is typecasted to u32.
> 
> 	Related return values
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2
> 
> 	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
> 		 valid object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] uaddr equal uaddr2. Requeue to same futex.
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_LOCK_PI on uaddr
> 
> 	[EAGAIN] uaddr1 readout is not equal the compare value in
> 		 argument val3
> 
> FUTEX_WAKE_OP
> 
> 
> Jakub, can you please explain it? I'm lost :)
> 
> 
> 	The argument val contains the maximum number of waiters on
> 	uaddr which are immediately woken up.
> 
> 	The timeout argument is abused to transport the maximum
> 	number of waiters on uaddr2 which are woken up. The pointer
> 	is typecasted to u32.
> 
> 	Related return values
> 
> 	[EFAULT] Kernel was unable to access the futex values at uaddr
> 		 or uaddr2
> 
> 	[EINVAL] The supplied uaddr or uaddr2 argument does not point
> 		 to a valid object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_LOCK_PI on uaddr
> 
> 
> FUTEX_WAIT_BITSET
> 
> 	The same as FUTEX_WAIT except that val3 is used to provide a
> 	32bit bitset to the kernel. This bitset is stored in the
> 	kernel internal state of the waiter.
> 
> 	This futex op also allows to have the option bit
> 	FUTEX_CLOCK_REALTIME set.
> 
> 	Related return values
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr.
> 
> 	[EINVAL] The supplied uaddr argument does not point to a valid
> 		 object, i.e. pointer is not 4 byte aligned
> 
>   	[EINVAL] The supplied bitset is zero.
> 
> 	[EINVAL] The supplied timeout argument is not normalized.
> 
> 	[ETIMEDOUT] timeout expired 
> 
> 
> FUTEX_WAKE_BITSET
> 
> 	The same as FUTEX_WAKE except that val3 is used to provide a
> 	32bit bitset to the kernel. This bitset is used to select
> 	waiters on the futex. The selection is done by a bitwise AND
> 	of the wake side supplied bitset and the bitset which is
> 	stored in the kernel internal state of the waiters. If the
> 	result is non zero, the waiter is woken, otherwise left
> 	waiting.
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr.
> 
> 	[EINVAL] The supplied uaddr argument does not point to a valid
> 		 object, i.e. pointer is not 4 byte aligned
> 
>   	[EINVAL] The supplied bitset is zero.
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_LOCK_PI
> 
> FUTEX_LOCK_PI
> 
> 	This operation reads from the futex address provided by the
> 	uaddr argument, which contains the namespace specific TID of
> 	the lock owner. If the TID is 0, then the kernel tries to set
> 	the waiters TID atomically. If the TID is nonzero or the take
> 	over fails the kernel sets atomically the FUTEX_WAITERS bit
> 	which signals the owner, that it cannot unlock the futex in
> 	user space atomically by transitioning from TID to 0. After
> 	that the kernel tries to find the task which is associated to
> 	the owner TID, creates or reuses kernel state on behalf of the
> 	owner and attaches the waiter to it. The enqueing of the
> 	waiter is in descending priority order if more than one waiter
> 	exists. The owner inherits either the priority or the
> 	bandwidth of the waiter. This inheritance follows the lock
> 	chain in the case of nested locking and performs deadlock
> 	detection.
> 
> 	The timeout argument is handled as described in FUTEX_WAIT.
> 	The arguments uaddr2, val, and val3 are ignored.
> 
> 	Related return values
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr.
> 
> 	[ENOMEM] Kernel could not allocate state
> 
> 	[EINVAL] The supplied uaddr argument does not point to a valid
> 		 object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] The supplied timeout argument is not normalized.
> 		 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state. Thats
> 		 either state corruption or it found a waiter on uaddr
> 		 which is waiting on FUTEX_WAIT[_BITSET]
> 
> 	[EPERM]  Caller is not allowed to attach itself to the futex.
> 		 Can be a legitimate issue or a hint for state
> 		 corruption in user space
> 
> 	[ESRCH]	 The TID in the user space value does not exist
> 
> 	[EAGAIN] The futex owner TID is about to exit, but has not yet
> 		 handled the internal state cleanup. Try again.	 
> 
> 	[ETIMEDOUT] timeout expired 
> 
> 	[EDEADLOCK] The futex is already locked by the caller or the kernel
> 		    detected a deadlock scenario in a nested lock chain
> 
> 	[EOWNERDIED] The owner of the futex died and the kernel made the
> 		     caller the new owner. The kernel sets the
> 		     FUTEX_OWNER_DIED bit in the futex userspace value.
> 		     Caller is responsible for cleanup
> 
>         [ENOSYS] Not implemented on all architectures and not supported
> 		 on some CPU variants  (runtime detection)
> 		     
> FUTEX_TRYLOCK_PI
> 
> 	This operation tries to acquire the futex at uaddr. It deals
> 	with the situation where the TID value at uaddr is 0, but the
> 	FUTEX_HAS_WAITER bit is set. User space cannot handle this
> 	race free.
> 
> 	The arguments uaddr2, val, timeout and val3 are ignored.
> 
> 	Return values:
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr.
> 
> 	[ENOMEM] Kernel could not allocate state
> 
> 	[EINVAL] The supplied uaddr argument does not point to a valid
> 		 object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] The kernel detected inconsistent state between the user
> 		 space state at uaddr and the kernel state
> 
> 	[EPERM]  Caller is not allowed to attach itself to the futex.
> 		 Can be a legitimate issue or a hint for state
> 		 corruption in user space
> 
> 	[ESRCH]	 The TID in the user space value does not exist
> 
> 	[EAGAIN] The futex owner TID is about to exit, but has not yet
> 		 handled the internal state cleanup. Try again.	 
> 
> 	[EDEADLOCK] The futex is already locked by the caller.
> 
> 	[EOWNERDIED] The owner of the futex died and the kernel made the
> 		     caller the new owner. The kernel sets the
> 		     FUTEX_OWNER_DIED bit in the futex userspace value.
> 		     Caller is responsible for cleanup
> 
>         [ENOSYS] Not implemented on all architectures and not supported
> 		 on some CPU variants (runtime detection)
> 
> FUTEX_UNLOCK_PI
> 
> 	This operation wakes the top priority waiter which is waiting
> 	in FUTEX_LOCK_PI on the futex address provided by the uaddr
> 	argument.
> 
> 	This is called when the user space value at uaddr cannot be
> 	changed atomically from TID (of the owner) to 0.
> 
> 	The arguments uaddr2, val, timeout and val3 are ignored.
> 
> 	Related return values:
> 	
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_WAIT[_BITSET].
> 
> 	[EPERM]  Caller does not own the futex.
> 
>         [ENOSYS] Not implemented on all architectures and not supported
> 		 on some CPU variants (runtime detection)
> 
> FUTEX_WAIT_REQUEUE_PI
> 
> 	Wait operation to wait on a non pi futex at uaddr and
> 	potentially be requeued on a pi futex at uaddr2. The wait
> 	operation on uaddr is the same as FUTEX_WAIT. The waiter can
> 	be removed from the wait on uaddr via FUTEX_WAKE without
> 	requeuing on uaddr2.
> 
> 	The timeout argument is handled as described in FUTEX_WAIT.
> 
> Darren, can you fill in the missing details?
> 
> 	Return values:
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr
> 		 or uaddr2
> 
> 	[EINVAL] The supplied uaddr or uaddr2 argument does not point
> 		 to a valid object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] The supplied timeout argument is not normalized.
> 
>   	[EINVAL] The supplied bitset is zero.
> 
> 	[EWOULDBLOCK] The atomic enqueueing failed. User space value
> 		      at uaddr is not equal val argument.
> 
> 	[ETIMEDOUT] timeout expired 
> 
> 	[EOWNERDIED] The owner of the PI futex at uaddr2 died and the
> 		     kernel made the caller the new owner. The kernel
> 		     sets the FUTEX_OWNER_DIED bit in the uaddr2 futex
> 		     userspace value.  Caller is responsible for
> 		     cleanup
> 
>         [ENOSYS] Not implemented on all architectures and not supported
> 		 on some CPU variants (runtime detection)
> 
> 
> FUTEX_CMP_REQUEUE_PI
> 
> 	PI aware variant of FUTEX_CMP_REQUEUE. Inner futex at uaddr is
> 	a non PI futex. Outer futex to which is requeued is a PI futex
> 	at uaddr2.
> 
> 	The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.
> 
> 	The argument val is contains the number of waiters on uaddr
> 	which are immediately woken up. Must be 1 for this opcode.
> 
> 	The timeout argument is abused to transport the number of
> 	waiters which are requeued on to the futex at uaddr2. The
> 	pointer is typecasted to u32.
> 
> Darren, can you fill in the missing details?
> 
> 	[EFAULT] Kernel was unable to access the futex value at uaddr
> 		 or uaddr2
> 
> 	[ENOMEM] Kernel could not allocate state
> 
> 	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
> 		 valid object, i.e. pointer is not 4 byte aligned
> 
> 	[EINVAL] uaddr equal uaddr2. Requeue to same futex.
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_LOCK_PI on uaddr
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_WAIT[_BITSET] on uaddr
> 
> 	[EINVAL] The kernel detected inconsistent state between the
> 		 user space state at uaddr2 and the kernel state,
> 		 i.e. it detected a waiter which waits in
> 		 FUTEX_WAIT on uaddr2.
> 
>   	[EINVAL] The supplied bitset is zero.
> 
> 	[EAGAIN] uaddr1 readout is not equal the compare value in
> 		 argument val3
> 
> 	[EAGAIN] The futex owner TID of uaddr2 is about to exit, but
> 		 has not yet handled the internal state cleanup. Try
> 		 again.
> 
> 	[EPERM]  Caller is not allowed to attach the waiter to the
> 		 futex at uaddr2 Can be a legitimate issue or a hint
> 		 for state corruption in user space
> 
> 	[ESRCH]	 The TID in the user space value at uaddr2 does not exist
> 
> 	[EDEADLOCK] The requeuing of a waiter to the kernel representation
> 		    of the PI futex at uaddr2 detected a deadlock scenario.
> 
>         [ENOSYS] Not implemented on all architectures and not supported
> 		 on some CPU variants (runtime detection)
> 
> 
> The various option bits seem to be undocumented as well
> 
> FUTEX_PRIVATE_FLAG
> 
> 	This option bit can be ored on all futex ops.
> 
> 	It tells the kernel, that the futex is process private and not
> 	shared with another process. That allows the kernel to chose
> 	the fast path for validating the user space address and avoids
> 	expensive VMA lookup, taking refcounts on file backing store
> 	etc.
> 
> FUTEX_CLOCK_REALTIME
> 
> 	This option bit can be ored on the futex ops FUTEX_WAIT_BITSET
> 	and FUTEX_WAIT_REQUEUE_PI
> 
> 	If set the kernel treats the user space supplied timeout as
> 	absolute time based on CLOCK_REALTIME.
> 
> 	If not set the kernel treats the user space supplied timeout
> 	as relative time.
> 
> 	If this is set on any other op than the supported ones, kernel
> 	returns ENOSYS!
> 
> 
> Thanks,
> 
> 	tglx
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 14:14           ` Thomas Gleixner
  2014-05-15 20:19             ` Michael Kerrisk (man-pages)
@ 2014-05-15 20:35             ` Darren Hart
  2015-01-15 15:12               ` Michael Kerrisk (man-pages)
  2015-01-15 15:10             ` Michael Kerrisk (man-pages)
  2 siblings, 1 reply; 80+ messages in thread
From: Darren Hart @ 2014-05-15 20:35 UTC (permalink / raw)
  To: Thomas Gleixner, Michael Kerrisk (man-pages)
  Cc: Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API

On 5/15/14, 7:14, "Thomas Gleixner" <tglx@linutronix.de> wrote:

Wow Thomas, I planned to do exactly this and you beat me to it. Again.
Thanks for getting this started.

Michael, I imagine you want something more condensed, and I'll add to what
tglx posted (inline below) to try and get you that, but if you have
questions and need to fill in the gap, the paper I presented at RTLWS11 in
'09 covers this particularly nasty OPCODE in detail:

http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf

I believe Michael is looking for some higher level documentation, like how
to use these and what they are intended for. Probably something more like
Ulrich's Futexes are Tricky paper - but let's start with getting the op
codes, arguments, and return codes fleshed out.



For all the PI opcodes, we should probably mention something about the
futex value scheme (TID), whereas the other opcodes do not require any
specific value scheme.

No Owner:	0
Owner:		TID
Waiters:	TID | FUTEX_WAITERS

This is the relevant section from the referenced paper:


	
		
		
	
	
		
			
				
					The PI futex operations diverge from the oth-
ers in that they impose a policy describing how
the futex value is to be used. If the lock is un-
owned, the futex value shall be 0. If owned, it
shall be the thread id (tid) of the owning thread.
If there are threads contending for the lock, then
the FUTEX_WAITERS flag is set. With this policy in
place, userspace can atomically acquire an unowned
lock or release an uncontended lock using an atomic
instruction and their own tid. A non-zero futex
value will force waiters into the kernel to lock. The
FUTEX_WAITERS flag forces the owner into the kernel
to unlock. If the callers are forced into the kernel,
they then deal directly with an underlying rt_mutex
which implements the priority inheritance semantics.
After the rt_mutex is acquired, the futex value is up-
dated accordingly, before the calling thread returns
to userspace.

				
			
		
	
It is important to note that the kernel will update the futex value prior
to returning to userspace. Unlike other futex op codes,
FUTEX_CMP_REUQUE_PI (and FUTEX_WAIT_REQUEUE_PI, FUTEX_LOCK_PI are designed
for the implementation of very specific IPC mechanisms).


>FUTEX_CMP_REQUEUE_PI
>
>	PI aware variant of FUTEX_CMP_REQUEUE. Inner futex at uaddr is
>	a non PI futex. Outer futex to which is requeued is a PI futex
>	at uaddr2.

Inner/outer terminology applies specifically to the glibc pthread
condition variable and mutex use case, but is overly specific for the man
page. Consider:

PI aware variant for FUTEX_CMP_REQUEUE. Requeue tasks blocked on uaddr via
FUTEX_WAIT_REQUEUE_PI from a non-PI source futex (uaddr) to a PI target
futex (uaddr2).

>
>	The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.
>
>	The argument val is contains the number of waiters on uaddr
>	which are immediately woken up. Must be 1 for this opcode.

Because the point is to avoid the thundering herd in the first place, and
other nasty little races and faulting corner cases...

>
>	The timeout argument is abused to transport the number of
>	waiters which are requeued on to the futex at uaddr2. The
>	pointer is typecasted to u32.


          val3 contains the expected value of uaddr (same as
FUTEX_CMP_REQUEUE)


>
>Darren, can you fill in the missing details?

Yup...

>
>	[EFAULT] Kernel was unable to access the futex value at uaddr
>		 or uaddr2
>
>	[ENOMEM] Kernel could not allocate state
>
>	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
>		 valid object, i.e. pointer is not 4 byte aligned
>
>	[EINVAL] uaddr equal uaddr2. Requeue to same futex.
>
>	[EINVAL] The kernel detected inconsistent state between the
>		 user space state at uaddr and the kernel state,
>		 i.e. it detected a waiter which waits in
>		 FUTEX_LOCK_PI on uaddr

                   instead of FUTEX_WAIT_REQUEUE_PI.

>
>	[EINVAL] The kernel detected inconsistent state between the
>		 user space state at uaddr and the kernel state,
>		 i.e. it detected a waiter which waits in
>		 FUTEX_WAIT[_BITSET] on uaddr
>
>	[EINVAL] The kernel detected inconsistent state between the
>		 user space state at uaddr2 and the kernel state,
>		 i.e. it detected a waiter which waits in
>		 FUTEX_WAIT on uaddr2.

          [EINVAL] The kernel detected the FUTEX_CMP_REQUEUE_PI call is
                   attempting to requeue a task to a futex other than that
                   specified by the matching FUTEX_WAIT_REQUEUE_PI call for
                   that task.

A number of these EINVALs can probably be combined into "Kernel detected
bad state" as far as the C library is concerned, but we can consolidate
later. But basically, EINVAL is returned if the non-pi to pi or op pairing
semantics are violated.



>
>  	[EINVAL] The supplied bitset is zero.

Bitset doesn't apply to FUTEX_CMP_REQUEUE_PI.

          [EINVAL] nr_wake != 1


EAGAIN == EWOULDBLOCK. We use each in the kernel, but will just refer to
them here as EAGAIN.

>	[EAGAIN] uaddr1 readout is not equal the compare value in
>		 argument val3
>
>	[EAGAIN] The futex owner TID of uaddr2 is about to exit, but
>		 has not yet handled the internal state cleanup. Try
>		 again.
>
>	[EPERM]  Caller is not allowed to attach the waiter to the
>		 futex at uaddr2 Can be a legitimate issue or a hint
>		 for state corruption in user space
>
>	[ESRCH]	 The TID in the user space value at uaddr2 does not exist

Hrm, I'm missing ESRCH and EPERM in my state diagrams.... put yes, we can
get ESRCH when looking up PI state, and we can return that from
futex_requeue.... That needs some time to review...

I'm not seeing the EPERM path, where is that coming from?




>
>	[EDEADLOCK] The requeuing of a waiter to the kernel representation
>		    of the PI futex at uaddr2 detected a deadlock scenario.
>
>        [ENOSYS] Not implemented on all architectures and not supported
>		 on some CPU variants (runtime detection)

Return value >= 0 is successful, indicating the number of of tasks
requeued or woken (3 requeued and 1 woken would return 4).

Thanks,

-- 
Darren Hart					Open Source Technology Center
darren.hart@intel.com				            Intel Corporation




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 20:19             ` Michael Kerrisk (man-pages)
@ 2014-08-04 14:46               ` Carlos O'Donell
  0 siblings, 0 replies; 80+ messages in thread
From: Carlos O'Donell @ 2014-08-04 14:46 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), Thomas Gleixner
  Cc: Darren Hart, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API

On 05/15/2014 04:19 PM, Michael Kerrisk (man-pages) wrote:
> On 05/15/2014 04:14 PM, Thomas Gleixner wrote:
>> On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
>>> And that universe would love to have your documentation of
>>> FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),
>>
>> I give you almost the full treatment, but I leave REQUEUE_PI to Darren
>> and FUTEX_WAKE_OP to Jakub. :)
> 
> Thanks Thomas--that's fantastic! Hopefully, Darren and Jakub fill in those
> missing pieces...

Michael,

Do you need any help getting these additional futex error codes
into the linux kernel man pages project? Thomas provided the
missing bits and Darren commented... what else do we need?

I'm asking because I want to point other Red Hat engineers at
these pages to say: "these are the canonical error codes." 

We're trying to cleanup the userspace side of things.

Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 19:38             ` Darren Hart
@ 2014-08-11 10:19               ` chrubis
  2014-11-26 13:41               ` Cyril Hrubis
  2015-02-16 13:14               ` Cyril Hrubis
  2 siblings, 0 replies; 80+ messages in thread
From: chrubis @ 2014-08-11 10:19 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi!
> >> How much LTP harness type code needs to be used?
> >
> >Not much.
> >
> >For this complexity of tests you would just need to call the tst_resm()
> >interface to report success/failure and, at the end of the test,
> >tst_exit() to return the stored overall test status.
> >
> >And ideally call the standard option parsing code and call the test in
> >standard loop so that the test can take advantage of standard options as
> >number of iterations to run, etc.
> >
> >Have a look at:
> >
> >https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines
> >
> >there is simple test example as well as description of the interfaces.
> 
> 
> Thanks Cyril,
> 
> I'll follow up with you in a couple weeks most likely. I have some urgent
> things that will be taking all my time and then some until then. Feel free
> to poke me though if I lose track of it :-)

Ping :)

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 19:38             ` Darren Hart
  2014-08-11 10:19               ` chrubis
@ 2014-11-26 13:41               ` Cyril Hrubis
  2015-02-16 13:14               ` Cyril Hrubis
  2 siblings, 0 replies; 80+ messages in thread
From: Cyril Hrubis @ 2014-11-26 13:41 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi!
> >For this complexity of tests you would just need to call the tst_resm()
> >interface to report success/failure and, at the end of the test,
> >tst_exit() to return the stored overall test status.
> >
> >And ideally call the standard option parsing code and call the test in
> >standard loop so that the test can take advantage of standard options as
> >number of iterations to run, etc.
> >
> >Have a look at:
> >
> >https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines
> >
> >there is simple test example as well as description of the interfaces.
> 
> 
> Thanks Cyril,
> 
> I'll follow up with you in a couple weeks most likely. I have some urgent
> things that will be taking all my time and then some until then. Feel free
> to poke me though if I lose track of it :-)

Do you still plan to work on this?

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 14:14           ` Thomas Gleixner
  2014-05-15 20:19             ` Michael Kerrisk (man-pages)
  2014-05-15 20:35             ` Darren Hart
@ 2015-01-15 15:10             ` Michael Kerrisk (man-pages)
  2015-01-15 22:23               ` Thomas Gleixner
  2015-01-23 18:29               ` Torvald Riegel
  2 siblings, 2 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-15 15:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Darren Hart, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API, Torvald Riegel,
	Roland McGrath, Darren Hart, Anton Blanchard, Peter Zijlstra,
	Petr Baudis, Eric Dumazet, bill o gallmeister, Jan Kiszka,
	Daniel Wagner, Rich Felker

[Adding a few people to CC that have expressed interest in the 
progress of the updates of this page, or who may be able to
provide review feedback. Eventually, you'll all get CCed on
the new draft of the page.]

Hello Thomas,

On 05/15/2014 04:14 PM, Thomas Gleixner wrote:
> On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
>> And that universe would love to have your documentation of 
>> FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),
> 
> I give you almost the full treatment, but I leave REQUEUE_PI to
> Darren and FUTEX_WAKE_OP to Jakub. :)

Thank you for the great effort you put into compiling the
text below, and apologies for my long delay in following up.

I've integrated almost all of your suggestions into the 
manual page. I will shortly send out a new draft of the
page that contains various FIXMEs for points that remain 
unclear.

Most of the rest of this mail is just a checklist noting
what I did with your comments. No response is needed 
in most cases, but there are a very few open questions in 
this mail that, to help you find them, I have marked with
"???". If you (or someone else) could reply to those, I 
would be grateful.

In the next day or two, I hope to send out the new version
of the futex(2) page for review. The new draft is a bit
bigger (okay -- 4 x bigger) than the current page. And there 
are a quite number of FIXMEs that I've placed in the page 
for various points--some minor, but a few major--that need
to be checked or fixed. Would you have some time to review
that page? 

For that matter, if anyone else would have time for
reviewing the page, could they shout out now. It's perhaps
unlikely, but I am worried about getting a thundering herd
of comments, and bringing the page to the state where I have 
it now has already been a fairly demanding task.

==========

> FUTEX_WAIT
> 
> < Existing blurb seems ok >
> 
> Related return values
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr.

Added/reworked.

> [EINVAL] The supplied uaddr argument does not pouint to a valid 
> object, i.e. pointer is not 4 byte aligned

Added.

> [EINVAL] The supplied timeout argument is not normalized.

Added, but with more detail.

> [EWOULDBLOCK] The atomic enqueueing failed. 

Added.

Note, however, that for consistency, I'll use EAGAIN throughout 
the page.

>  User space value at uaddr
> is not equal val argument.

Was already present.

> [ETIMEDOUT] timeout expired

Was present, but I have now added more detail.

==========

> FUTEX_WAKE
> 
> < Existing blurb seems ok >
> 
> Related return values
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr.

Added/reworked.

> [EINVAL] The supplied uaddr argument does not point to a valid 
> object, i.e. pointer is not 4 byte aligned

Added.

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_LOCK_PI

Added.

==========

> FUTEX_REQUEUE
> 
> Existing blurb seems ok , except for this:
> 
> The argument val contains the number of waiters on uaddr which are
> immediately woken up.
> The timeout argument is abused to transport the number of waiters
> which are requeued to the futex at uaddr2. The pointer is typecasted
> to u32.

What I've actually done with the main text for FUTEX_REQUEUE is defer 
to the (now-expanded) discussion of FUTEX_CMP_REQUEUE. 

> [EFAULT] Kernel was unable to access the futex value at uaddr or
> uaddr2

Added/reworked.

> [EINVAL] The supplied uaddr/uaddr2 arguments do not point to a valid
> object, i.e. pointer is not 4 byte aligned

Added.

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_LOCK_PI on uaddr

Added.

> [EINVAL] uaddr equal uaddr2. Requeue to same futex.

??? I added this, but does this error not occur only for PI requeues?

==========

> FUTEX_REQUEUE_CMP
> 
> Existing blurb seems ok , except for this:

[[
> The argument val is contains the number of waiters on uaddr which are
> immediately woken up.
> 
> The timeout argument is abused to transport the number of waiters
> which are requeued to the futex at uaddr2. The pointer is typecasted
> to u32.
]]

Covered now (in more detail).

> Related return values
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr or
> uaddr2

Added/reworked.

> [EINVAL] The supplied uaddr/uaddr2 arguments do not point to a valid
> object, i.e. pointer is not 4 byte aligned

Added.

> [EINVAL] uaddr equal uaddr2. Requeue to same futex.

Added.

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_LOCK_PI on uaddr

Added

> [EAGAIN] uaddr1 readout is not equal the compare value in argument
> val3

Was already present.

==========

> FUTEX_WAKE_OP
> 
> 
> Jakub, can you please explain it? I'm lost :)

I had a read of Ulrich Drepper's "Futexes are Tricky", and the source 
code, and took a shot at it. I'd like to have someone check what 
I wrote though. See the draft that I will soon send out.

> The argument val contains the maximum number of waiters on uaddr
> which are immediately woken up.

Covered in my new text.

> The timeout argument is abused to transport the maximum number of
> waiters on uaddr2 which are woken up. The pointer is typecasted to
> u32.

Covered in my new text.

> Related return values
> 
> [EFAULT] Kernel was unable to access the futex values at uaddr or
> uaddr2

This point was covered already in ERRORS.

> [EINVAL] The supplied uaddr or uaddr2 argument does not point to a
> valid object, i.e. pointer is not 4 byte aligned

This point was covered already in ERRORS.

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_LOCK_PI on uaddr

I added this point.

==========

> FUTEX_WAIT_BITSET
> 
> The same as FUTEX_WAIT except that val3 is used to provide a 32bit
> bitset to the kernel. This bitset is stored in the kernel internal
> state of the waiter.

Added.

> This futex op also allows to have the option bit FUTEX_CLOCK_REALTIME
> set.

Added.

> Related return values
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr.

Already covered.

> [EINVAL] The supplied uaddr argument does not point to a valid 
> object, i.e. pointer is not 4 byte aligned

Already covered.

> [EINVAL] The supplied bitset is zero.

Added.

> [EINVAL] The supplied timeout argument is not normalized.

Already covered.

> [ETIMEDOUT] timeout expired

Already covered.

==========

> FUTEX_WAKE_BITSET
> 
> The same as FUTEX_WAKE except that val3 is used to provide a 32bit
> bitset to the kernel. This bitset is used to select waiters on the
> futex. The selection is done by a bitwise AND of the wake side
> supplied bitset and the bitset which is stored in the kernel internal
> state of the waiters. If the result is non zero, the waiter is woken,
> otherwise left waiting.

Added (along with quite a bit of other detail).

> [EFAULT] Kernel was unable to access the futex value at uaddr.

Covered already.

> [EINVAL] The supplied uaddr argument does not point to a valid 
> object, i.e. pointer is not 4 byte aligned

Covered already.

> [EINVAL] The supplied bitset is zero.

Added.

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_LOCK_PI

Added.

==========

> FUTEX_LOCK_PI
> 
> This operation reads from the futex address provided by the uaddr
> argument, which contains the namespace specific TID of the lock
> owner. If the TID is 0, then the kernel tries to set the waiters TID
> atomically. If the TID is nonzero or the take over fails the kernel
> sets atomically the FUTEX_WAITERS bit which signals the owner, that
> it cannot unlock the futex in user space atomically by transitioning
> from TID to 0. After that the kernel tries to find the task which is
> associated to the owner TID, creates or reuses kernel state on behalf
> of the owner and attaches the waiter to it. The enqueing of the 
> waiter is in descending priority order if more than one waiter 
> exists. The owner inherits either the priority or the bandwidth of
> the waiter. This inheritance follows the lock chain in the case of
> nested locking and performs deadlock detection.

Added.

> The timeout argument is handled as described in FUTEX_WAIT. The
> arguments uaddr2, val, and val3 are ignored.

Added. Note, though, that some crufty code gives the impression
that FUTEX_LOCK_PI uses 'val'. I'll send a patch separately.

> Related return values
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr.

Already covered.

> [ENOMEM] Kernel could not allocate state

Added

> [EINVAL] The supplied uaddr argument does not point to a valid 
> object, i.e. pointer is not 4 byte aligned

Already covered.

> [EINVAL] The supplied timeout argument is not normalized.

Already covered.

> [EINVAL]
> The kernel detected inconsistent state between the user space state
> at uaddr and the kernel state. Thats either state corruption or it
> found a waiter on uaddr which is waiting on FUTEX_WAIT[_BITSET]

Added.

> [EPERM]  Caller is not allowed to attach itself to the futex. Can be
> a legitimate issue or a hint for state corruption in user space

Added.

> [ESRCH]	 The TID in the user space value does not exist

Added.

> [EAGAIN] The futex owner TID is about to exit, but has not yet 
> handled the internal state cleanup. Try again.

Added.

> [ETIMEDOUT] timeout expired

Already covered.

> [EDEADLOCK] The futex is already locked by the caller or the kernel 
> detected a deadlock scenario in a nested lock chain

Added.

> [EOWNERDIED] The owner of the futex died and the kernel made the 
> caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit in the
> futex userspace value. Caller is responsible for cleanup

There is no such thing as an EOWNERDIED error. I had a look
through the kernel source for the FUTEX_OWNER_DIED cases and didn't 
see an obvious error associated with them. Can you clarify? (I think 
the point is that this condition, which is described in
Documentation/robust-futexes.txt, is not an error as such. However, I'm
not yet sure of how to describe it in the man page.)
I will add this point as a FIXME in the new draft man page.

> [ENOSYS] Not implemented on all architectures and not supported on
> some CPU variants  (runtime detection)  

Added.

==========

> FUTEX_TRYLOCK_PI
> 
> This operation tries to acquire the futex at uaddr. It deals with the
> situation where the TID value at uaddr is 0, but the FUTEX_HAS_WAITER
> bit is set. User space cannot handle this race free.

Added.

> The arguments uaddr2, val, timeout and val3 are ignored.

??? But the code reads:

        case FUTEX_TRYLOCK_PI:
                return futex_lock_pi(uaddr, flags, 0, timeout, 1);
 
which momentarily misleads one into thinking that 'timeout' is used.
And: it's not quite ignored, since in futex_lock_pi() a non-NULL
'timeout' is unconditionally dereferenced (meaning you could get
an EFAULT error for a bad 'timeout' pointer).
I'm confused....

Maybe the above code should be

        case FUTEX_TRYLOCK_PI:
                return futex_lock_pi(uaddr, flags, 0, NULL, 1);
?

> Return values:
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr.

Already covered.

> [ENOMEM] Kernel could not allocate state

Added.

> [EINVAL] The supplied uaddr argument does not point to a valid 
> object, i.e. pointer is not 4 byte aligned

Already covered.

> [EINVAL] The kernel detected inconsistent state between the user 
> space state at uaddr and the kernel state

Added, but with the same text as for FUTEX_LOCK_PI above. I.e., the text
"Thats either state corruption or it found a waiter on uaddr which is
waiting on FUTEX_WAIT[_BITSET]" is also included.

> [EPERM]  Caller is not allowed to attach itself to the futex. Can be
> a legitimate issue or a hint for state corruption in user space

Added.

> [ESRCH]	 The TID in the user space value does not exist

Added.

> [EAGAIN] The futex owner TID is about to exit, but has not yet 
> handled the internal state cleanup. Try again.

Added.

> [EDEADLOCK] The futex is already locked by the caller.

Added.

> [EOWNERDIED] The owner of the futex died and the kernel made the 
> caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit in the
> futex userspace value. Caller is responsible for cleanup

See comment above concerning EOWNERDIED for FUTEX_LOCK_PI

> [ENOSYS] Not implemented on all architectures and not supported on
> some CPU variants (runtime detection)

Added.

==========

> FUTEX_UNLOCK_PI
> 
> This operation wakes the top priority waiter which is waiting in
> FUTEX_LOCK_PI on the futex address provided by the uaddr argument.
> 
> This is called when the user space value at uaddr cannot be changed
> atomically from TID (of the owner) to 0.
> 
> The arguments uaddr2, val, timeout and val3 are ignored.

Added.

> Related return values:  
> [EINVAL] The kernel detected inconsistent
> state between the user space state at uaddr and the kernel state, 
> i.e. it detected a waiter which waits in FUTEX_WAIT[_BITSET].

Added (but with a question in a FIXME).

> [EPERM]  Caller does not own the futex.

Added.

> [ENOSYS] Not implemented on all architectures and not supported on
> some CPU variants (runtime detection)

Added.

==========

> FUTEX_WAIT_REQUEUE_PI
> 
> Wait operation to wait on a non pi futex at uaddr and potentially be
> requeued on a pi futex at uaddr2. The wait operation on uaddr is the
> same as FUTEX_WAIT. The waiter can be removed from the wait on uaddr
> via FUTEX_WAKE without requeuing on uaddr2.

Added.

> The timeout argument is handled as described in FUTEX_WAIT.

The above seems not to be correct. I've written the discussion of
'timeout' up as I understand it, and added a FIXME to the draft page.

> Darren, can you fill in the missing details?

> Return values:
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr or
> uaddr2

Already covered.

> [EINVAL] The supplied uaddr or uaddr2 argument does not point to a
> valid object, i.e. pointer is not 4 byte aligned

Already covered.

> [EINVAL] The supplied timeout argument is not normalized.

Already covered.

> [EINVAL] The supplied bitset is zero.

??? I don't believe this can happen. 'val3' is internally set to
FUTEX_BITSET_MATCH_ANY. Can you confirm?

> [EWOULDBLOCK] The atomic enqueueing failed. User space value at uaddr
> is not equal val argument.

Added using the same text as FUTEX_WAIT:

       EAGAIN (FUTEX_WAIT, FUTEX_WAIT_REQUEUE_PI) The value pointed to
              by  uaddr was not equal to the expected value val at the
              time of the call.

> [ETIMEDOUT] timeout expired

Already covered.

> [EOWNERDIED] The owner of the PI futex at uaddr2 died and the kernel
> made the caller the new owner. The kernel sets the FUTEX_OWNER_DIED
> bit in the uaddr2 futex userspace value.  Caller is responsible for 
> cleanup

See comment above concerning EOWNERDIED for FUTEX_LOCK_PI

> [ENOSYS] Not implemented on all architectures and not supported on
> some CPU variants (runtime detection)

Added.

==========

> FUTEX_CMP_REQUEUE_PI
> 
> PI aware variant of FUTEX_CMP_REQUEUE. Inner futex at uaddr is a non
> PI futex. Outer futex to which is requeued is a PI futex at uaddr2.

I instead used Darren's proposed text:
 
# PI aware variant for FUTEX_CMP_REQUEUE. Requeue tasks blocked on uaddr via
# FUTEX_WAIT_REQUEUE_PI from a non-PI source futex (uaddr) to a PI target
# futex (uaddr2).

> The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.

Covered above.

> The argument val is contains the number of waiters on uaddr which are
> immediately woken up. Must be 1 for this opcode.

Added.

> The timeout argument is abused to transport the number of waiters
> which are requeued on to the futex at uaddr2. The pointer is
> typecasted to u32.

Added.

> Darren, can you fill in the missing details?
> 
> [EFAULT] Kernel was unable to access the futex value at uaddr or
> uaddr2

Already covered.

> [ENOMEM] Kernel could not allocate state

Added.

> [EINVAL] The supplied uaddr/uaddr2 arguments do not point to a valid
> object, i.e. pointer is not 4 byte aligned

Already covered.

> [EINVAL] uaddr equal uaddr2. Requeue to same futex.

Added.

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_LOCK_PI on uaddr

Added

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_WAIT[_BITSET] on uaddr

Added.

> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr2 and the kernel state, i.e. it detected a waiter
> which waits in FUTEX_WAIT on uaddr2.

Added.

> [EINVAL] The supplied bitset is zero.

Darren Hart noted: Bitset doesn't apply to FUTEX_CMP_REQUEUE_PI.

> [EAGAIN] uaddr1 readout is not equal the compare value in argument
> val3

Added.

> [EAGAIN] The futex owner TID of uaddr2 is about to exit, but has not
> yet handled the internal state cleanup. Try again.

Added.

> [EPERM]  Caller is not allowed to attach the waiter to the futex at
> uaddr2 Can be a legitimate issue or a hint for state corruption in
> user space

Added.

> [ESRCH]	 The TID in the user space value at uaddr2 does not exist

Added.

> [EDEADLOCK] The requeuing of a waiter to the kernel representation of
> the PI futex at uaddr2 detected a deadlock scenario.

Added.

> [ENOSYS] Not implemented on all architectures and not supported on
> some CPU variants (runtime detection)

Added.

==========

> The various option bits seem to be undocumented as well

Yes, thanks for that.

> FUTEX_PRIVATE_FLAG

I've added this one, along with the detail "(since Linux 2.6.22)"

> This option bit can be ored on all futex ops.
> 
> It tells the kernel, that the futex is process private and not shared
> with another process. That allows the kernel to chose the fast path
> for validating the user space address and avoids expensive VMA
> lookup, taking refcounts on file backing store etc.
> 
> FUTEX_CLOCK_REALTIME

I've added this one, along with the detail "(since Linux 2.6.28)"

> This option bit can be ored on the futex ops FUTEX_WAIT_BITSET and
> FUTEX_WAIT_REQUEUE_PI
> 
> If set the kernel treats the user space supplied timeout as absolute
> time based on CLOCK_REALTIME.
> 
> If not set the kernel treats the user space supplied timeout as
> relative time.
> 
> If this is set on any other op than the supported ones, kernel 
> returns ENOSYS!

The details in the preceding 4 paragraphs have been integrated.

Thanks,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 20:35             ` Darren Hart
@ 2015-01-15 15:12               ` Michael Kerrisk (man-pages)
  2015-01-17  1:33                 ` Darren Hart
  0 siblings, 1 reply; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-15 15:12 UTC (permalink / raw)
  To: Darren Hart, Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API

Hello Darren,

I give you the same apology as to Thomas for the 
long-delayed response to your mail.

And I repeat my note to Thomas:
In the next day or two, I hope to send out the new version
of the futex(2) page for review. The new draft is a bit
bigger (okay -- 4 x bigger) than the current page. And there 
are a quite number of FIXMEs that I've placed in the page 
for various points--some minor, but a few major--that need
to be checked or fixed. Would you have some time to review
that page? 

In the meantime, I have a couple of questions, which, if 
you could answer them, I would work some changes into the 
page before sending.

1. In various places, distinction is made between non-PI 
   futexs and PI futexes. But what determines that distinction?
   From the kernel's perspective, hat make a futex one type
   or another? I presume it is to do with the types of blocking
   waiters on the futex, but it would be good to have a formal
   definition.

2. Can you say something about the pairing requirements of
   FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI. 
   What is the requirement and why do we need it?

Most of the rest of this mail is just a checklist noting
what I did with your comments. No response is needed 
in most cases, but there is one that I have marked with
"???". If you could reply to that. I'd be grateful.

On 05/15/2014 10:35 PM, Darren Hart wrote:
> On 5/15/14, 7:14, "Thomas Gleixner" <tglx@linutronix.de> wrote:
> 
> Wow Thomas, I planned to do exactly this and you beat me to it. Again.
> Thanks for getting this started.
> 
> Michael, I imagine you want something more condensed, and I'll add to what
> tglx posted (inline below) to try and get you that, but if you have
> questions and need to fill in the gap, the paper I presented at RTLWS11 in
> '09 covers this particularly nasty OPCODE in detail:
> 
> http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
> 
> I believe Michael is looking for some higher level documentation, like how
> to use these and what they are intended for. 

Yes, that would be good.

> Probably something more like
> Ulrich's Futexes are Tricky paper - but let's start with getting the op
> codes, arguments, and return codes fleshed out.

Okay.

> For all the PI opcodes, we should probably mention something about the
> futex value scheme (TID), whereas the other opcodes do not require any
> specific value scheme.
> 
> No Owner:	0
> Owner:		TID
> Waiters:	TID | FUTEX_WAITERS
> 
> This is the relevant section from the referenced paper:
> 				
> The PI futex operations diverge from the oth-
> ers in that they impose a policy describing how
> the futex value is to be used. If the lock is un-
> owned, the futex value shall be 0. If owned, it
> shall be the thread id (tid) of the owning thread.
> If there are threads contending for the lock, then
> the FUTEX_WAITERS flag is set. With this policy in
> place, userspace can atomically acquire an unowned
> lock or release an uncontended lock using an atomic
> instruction and their own tid. A non-zero futex
> value will force waiters into the kernel to lock. The
> FUTEX_WAITERS flag forces the owner into the kernel
> to unlock. If the callers are forced into the kernel,
> they then deal directly with an underlying rt_mutex
> which implements the priority inheritance semantics.
> After the rt_mutex is acquired, the futex value is up-
> dated accordingly, before the calling thread returns
> to userspace.
>
> It is important to note that the kernel will update the futex value prior
> to returning to userspace. Unlike other futex op codes,
> FUTEX_CMP_REUQUE_PI (and FUTEX_WAIT_REQUEUE_PI, FUTEX_LOCK_PI are designed
> for the implementation of very specific IPC mechanisms).

??? Great text. May I presume that I can take this text 
and freely adapt it for the man page? (Actually, this is a 
request for forgiveness, rather than permission :-).)

>> FUTEX_CMP_REQUEUE_PI
>>
>> 	PI aware variant of FUTEX_CMP_REQUEUE. Inner futex at uaddr is
>> 	a non PI futex. Outer futex to which is requeued is a PI futex
>> 	at uaddr2.
> 
> Inner/outer terminology applies specifically to the glibc pthread
> condition variable and mutex use case, but is overly specific for the man
> page. Consider:
> 
> PI aware variant for FUTEX_CMP_REQUEUE. Requeue tasks blocked on uaddr via
> FUTEX_WAIT_REQUEUE_PI from a non-PI source futex (uaddr) to a PI target
> futex (uaddr2).

Thanks for that text. It is easier to grasp.

>>
>> 	The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.
>>
>> 	The argument val is contains the number of waiters on uaddr
>> 	which are immediately woken up. Must be 1 for this opcode.
> 
> Because the point is to avoid the thundering herd in the first place, and
> other nasty little races and faulting corner cases...

I added the piece about "thundering herd".

>> 	The timeout argument is abused to transport the number of
>> 	waiters which are requeued on to the futex at uaddr2. The
>> 	pointer is typecasted to u32.
> 
> 
>           val3 contains the expected value of uaddr (same as
> FUTEX_CMP_REQUEUE)

Yes. (The text now says that 'val3' has the same purpose as 
for FUTEX_CMP_REQUEUE.)

>> Darren, can you fill in the missing details?
> 
> Yup...
> 
>>
>> 	[EFAULT] Kernel was unable to access the futex value at uaddr
>> 		 or uaddr2
>>
>> 	[ENOMEM] Kernel could not allocate state
>>
>> 	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
>> 		 valid object, i.e. pointer is not 4 byte aligned
>>
>> 	[EINVAL] uaddr equal uaddr2. Requeue to same futex.
>>
>> 	[EINVAL] The kernel detected inconsistent state between the
>> 		 user space state at uaddr and the kernel state,
>> 		 i.e. it detected a waiter which waits in
>> 		 FUTEX_LOCK_PI on uaddr
> 
>                    instead of FUTEX_WAIT_REQUEUE_PI.

Thanks. I added that detail.

>> 	[EINVAL] The kernel detected inconsistent state between the
>> 		 user space state at uaddr and the kernel state,
>> 		 i.e. it detected a waiter which waits in
>> 		 FUTEX_WAIT[_BITSET] on uaddr
>>
>> 	[EINVAL] The kernel detected inconsistent state between the
>> 		 user space state at uaddr2 and the kernel state,
>> 		 i.e. it detected a waiter which waits in
>> 		 FUTEX_WAIT on uaddr2.
> 
>           [EINVAL] The kernel detected the FUTEX_CMP_REQUEUE_PI call is
>                    attempting to requeue a task to a futex other than that
>                    specified by the matching FUTEX_WAIT_REQUEUE_PI call for
>                    that task.

Thanks. Added.

> A number of these EINVALs can probably be combined into "Kernel detected
> bad state" as far as the C library is concerned, but we can consolidate
> later. But basically, EINVAL is returned if the non-pi to pi or op pairing
> semantics are violated.

I think the page probably needs some text to cover that point. I'll add
a FIXME for review.

>>  	[EINVAL] The supplied bitset is zero.
> 
> Bitset doesn't apply to FUTEX_CMP_REQUEUE_PI.

Thanks.

>           [EINVAL] nr_wake != 1

Thanks, I'd already spotted this, but it's good to have confirmation.

> EAGAIN == EWOULDBLOCK. We use each in the kernel, but will just refer to
> them here as EAGAIN.

Yes. And I've followed that convention now in the man page.

>> 	[EAGAIN] uaddr1 readout is not equal the compare value in
>> 		 argument val3
>>
>> 	[EAGAIN] The futex owner TID of uaddr2 is about to exit, but
>> 		 has not yet handled the internal state cleanup. Try
>> 		 again.
>>
>> 	[EPERM]  Caller is not allowed to attach the waiter to the
>> 		 futex at uaddr2 Can be a legitimate issue or a hint
>> 		 for state corruption in user space
>>
>> 	[ESRCH]	 The TID in the user space value at uaddr2 does not exist
> 
> Hrm, I'm missing ESRCH and EPERM in my state diagrams.... put yes, we can
> get ESRCH when looking up PI state, and we can return that from
> futex_requeue.... That needs some time to review...
> 
> I'm not seeing the EPERM path, where is that coming from?

Any further insight on the above?

>> 	[EDEADLOCK] The requeuing of a waiter to the kernel representation
>> 		    of the PI futex at uaddr2 detected a deadlock scenario.
>>
>>        [ENOSYS] Not implemented on all architectures and not supported
>> 		 on some CPU variants (runtime detection)
> 
> Return value >= 0 is successful, indicating the number of of tasks
> requeued or woken (3 requeued and 1 woken would return 4).

Yes. Already noted.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-15 15:10             ` Michael Kerrisk (man-pages)
@ 2015-01-15 22:23               ` Thomas Gleixner
  2015-01-16 15:17                 ` Michael Kerrisk (man-pages)
  2015-01-23 18:29               ` Torvald Riegel
  1 sibling, 1 reply; 80+ messages in thread
From: Thomas Gleixner @ 2015-01-15 22:23 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Carlos O'Donell, Darren Hart, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Torvald Riegel, Roland McGrath,
	Darren Hart, Anton Blanchard, Peter Zijlstra, Petr Baudis,
	Eric Dumazet, bill o gallmeister, Jan Kiszka, Daniel Wagner,
	Rich Felker

On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
> > [EINVAL] uaddr equal uaddr2. Requeue to same futex.
> 
> ??? I added this, but does this error not occur only for PI requeues?

It's equally wrong for normal futexes. And its actually the same code
checking for this for all variants.

> > [EDEADLOCK] The futex is already locked by the caller or the kernel 
> > detected a deadlock scenario in a nested lock chain
>
> Added.

It's actually EDEADLK

> 
> > [EOWNERDIED] The owner of the futex died and the kernel made the 
> > caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit in the
> > futex userspace value. Caller is responsible for cleanup
> 
> There is no such thing as an EOWNERDIED error. I had a look
> through the kernel source for the FUTEX_OWNER_DIED cases and didn't 
> see an obvious error associated with them. Can you clarify? (I think 
> the point is that this condition, which is described in
> Documentation/robust-futexes.txt, is not an error as such. However, I'm
> not yet sure of how to describe it in the man page.)
> I will add this point as a FIXME in the new draft man page.

Oops. My bad. That's not the what the kernel does. The kernel merily
marks it in the futex itself with FUTEX_OWNER_DIED. User space needs
to deal with that and the posix users return EOWNERDEAD (not
EOWNERDIED], so it's not part of the futex call itself.

We had discussions about returning EOWNERDEAD in that case, but then
glibc with its sophisticated error handling prevented that ....
 
> > FUTEX_TRYLOCK_PI
> > 
> > This operation tries to acquire the futex at uaddr. It deals with the
> > situation where the TID value at uaddr is 0, but the FUTEX_HAS_WAITER
> > bit is set. User space cannot handle this race free.
> 
> Added.
> 
> > The arguments uaddr2, val, timeout and val3 are ignored.
> 
> ??? But the code reads:
> 
>         case FUTEX_TRYLOCK_PI:
>                 return futex_lock_pi(uaddr, flags, 0, timeout, 1);
>  
> which momentarily misleads one into thinking that 'timeout' is used.
> And: it's not quite ignored, since in futex_lock_pi() a non-NULL
> 'timeout' is unconditionally dereferenced (meaning you could get
> an EFAULT error for a bad 'timeout' pointer).
> I'm confused....

Indeed. That's just wrong.
 
> Maybe the above code should be
> 
>         case FUTEX_TRYLOCK_PI:
>                 return futex_lock_pi(uaddr, flags, 0, NULL, 1);
> ?

Care to send a patch?
 
> > FUTEX_WAIT_REQUEUE_PI
> > 
> > Wait operation to wait on a non pi futex at uaddr and potentially be
> > requeued on a pi futex at uaddr2. The wait operation on uaddr is the
> > same as FUTEX_WAIT. The waiter can be removed from the wait on uaddr
> > via FUTEX_WAKE without requeuing on uaddr2.
> 
> Added.
> 
> > The timeout argument is handled as described in FUTEX_WAIT.
> 
> The above seems not to be correct. I've written the discussion of
> 'timeout' up as I understand it, and added a FIXME to the draft page.
> 
> > Darren, can you fill in the missing details?
> 
> > Return values:
> > 
> > [EFAULT] Kernel was unable to access the futex value at uaddr or
> > uaddr2
> 
> Already covered.
> 
> > [EINVAL] The supplied uaddr or uaddr2 argument does not point to a
> > valid object, i.e. pointer is not 4 byte aligned
> 
> Already covered.
> 
> > [EINVAL] The supplied timeout argument is not normalized.
> 
> Already covered.
> 
> > [EINVAL] The supplied bitset is zero.
> 
> ??? I don't believe this can happen. 'val3' is internally set to
> FUTEX_BITSET_MATCH_ANY. Can you confirm?

Right. We dont support that bitset stuff in requeue_pi ATM.
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-15 22:23               ` Thomas Gleixner
@ 2015-01-16 15:17                 ` Michael Kerrisk (man-pages)
  2015-01-16 15:20                   ` Thomas Gleixner
  0 siblings, 1 reply; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-16 15:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Darren Hart, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API, Torvald Riegel,
	Roland McGrath, Darren Hart, Anton Blanchard, Petr Baudis,
	Eric Dumazet, bill o gallmeister, Jan Kiszka, Daniel Wagner,
	Rich Felker

Hello Thomas,

On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
>>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
>>
>> ??? I added this, but does this error not occur only for PI requeues?
> 
> It's equally wrong for normal futexes. And its actually the same code
> checking for this for all variants.

I don't understand "equally wrong" in your reply, I'm sorry. Do you
mean:

a) This error text should be there for both normal and PI requeues
OR
a) This error text should be there for neither normal nor PI requeues

>>> [EDEADLOCK] The futex is already locked by the caller or the kernel 
>>> detected a deadlock scenario in a nested lock chain
>>
>> Added.
> 
> It's actually EDEADLK

Yes, sorry -- I should have said that I already found and fixed 
that problem.

>>> [EOWNERDIED] The owner of the futex died and the kernel made the 
>>> caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit in the
>>> futex userspace value. Caller is responsible for cleanup
>>
>> There is no such thing as an EOWNERDIED error. I had a look
>> through the kernel source for the FUTEX_OWNER_DIED cases and didn't 
>> see an obvious error associated with them. Can you clarify? (I think 
>> the point is that this condition, which is described in
>> Documentation/robust-futexes.txt, is not an error as such. However, I'm
>> not yet sure of how to describe it in the man page.)
>> I will add this point as a FIXME in the new draft man page.
> 
> Oops. My bad. That's not the what the kernel does. The kernel merily
> marks it in the futex itself with FUTEX_OWNER_DIED. User space needs
> to deal with that and the posix users return EOWNERDEAD (not
> EOWNERDIED], so it's not part of the futex call itself.
> 
> We had discussions about returning EOWNERDEAD in that case, but then
> glibc with its sophisticated error handling prevented that ....

Okay. I'll add a FIXME to the draft page, to see if we get some good 
text together to describe FUTEX_OWNER_DIED and how it is used.

>>> FUTEX_TRYLOCK_PI
>>>
>>> This operation tries to acquire the futex at uaddr. It deals with the
>>> situation where the TID value at uaddr is 0, but the FUTEX_HAS_WAITER
>>> bit is set. User space cannot handle this race free.
>>
>> Added.
>>
>>> The arguments uaddr2, val, timeout and val3 are ignored.
>>
>> ??? But the code reads:
>>
>>         case FUTEX_TRYLOCK_PI:
>>                 return futex_lock_pi(uaddr, flags, 0, timeout, 1);
>>  
>> which momentarily misleads one into thinking that 'timeout' is used.
>> And: it's not quite ignored, since in futex_lock_pi() a non-NULL
>> 'timeout' is unconditionally dereferenced (meaning you could get
>> an EFAULT error for a bad 'timeout' pointer).
>> I'm confused....
> 
> Indeed. That's just wrong.
>  
>> Maybe the above code should be
>>
>>         case FUTEX_TRYLOCK_PI:
>>                 return futex_lock_pi(uaddr, flags, 0, NULL, 1);
>> ?
> 
> Care to send a patch?

Will do.
  
[...]

>> ??? I don't believe this can happen. 'val3' is internally set to
>> FUTEX_BITSET_MATCH_ANY. Can you confirm?
> 
> Right. We dont support that bitset stuff in requeue_pi ATM.

Thanks for the confirmation.

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-16 15:17                 ` Michael Kerrisk (man-pages)
@ 2015-01-16 15:20                   ` Thomas Gleixner
  2015-01-16 20:54                     ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 80+ messages in thread
From: Thomas Gleixner @ 2015-01-16 15:20 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Carlos O'Donell, Darren Hart, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Torvald Riegel, Roland McGrath,
	Darren Hart, Anton Blanchard, Petr Baudis, Eric Dumazet,
	bill o gallmeister, Jan Kiszka, Daniel Wagner, Rich Felker

On Fri, 16 Jan 2015, Michael Kerrisk (man-pages) wrote:

> Hello Thomas,
> 
> On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
> > On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
> >>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
> >>
> >> ??? I added this, but does this error not occur only for PI requeues?
> > 
> > It's equally wrong for normal futexes. And its actually the same code
> > checking for this for all variants.
> 
> I don't understand "equally wrong" in your reply, I'm sorry. Do you
> mean:
> 
> a) This error text should be there for both normal and PI requeues

It is there for both. The requeue code has that check independent of
the requeue type (normal/pi). It never makes sense to requeue
something to itself whether normal or pi futex. We added this for PI,
because there it is harmful, but we did not special case it. So normal
futexes get the same treatment.

Thanks,

	tglx




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-16 15:20                   ` Thomas Gleixner
@ 2015-01-16 20:54                     ` Michael Kerrisk (man-pages)
  2015-01-17  0:46                       ` Darren Hart
  2015-01-17  0:56                       ` Davidlohr Bueso
  0 siblings, 2 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-16 20:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Darren Hart, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API, Torvald Riegel,
	Roland McGrath, Darren Hart, Anton Blanchard, Petr Baudis,
	Eric Dumazet, bill o gallmeister, Jan Kiszka, Daniel Wagner,
	Rich Felker

On 01/16/2015 04:20 PM, Thomas Gleixner wrote:
> On Fri, 16 Jan 2015, Michael Kerrisk (man-pages) wrote:
> 
>> Hello Thomas,
>>
>> On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
>>> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
>>>>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
>>>>
>>>> ??? I added this, but does this error not occur only for PI requeues?
>>>
>>> It's equally wrong for normal futexes. And its actually the same code
>>> checking for this for all variants.
>>
>> I don't understand "equally wrong" in your reply, I'm sorry. Do you
>> mean:
>>
>> a) This error text should be there for both normal and PI requeues
> 
> It is there for both. The requeue code has that check independent of
> the requeue type (normal/pi). It never makes sense to requeue
> something to itself whether normal or pi futex. We added this for PI,
> because there it is harmful, but we did not special case it. So normal
> futexes get the same treatment.

Hello Thomas, 

Color me stupid, but I can't see this in futex_requeue(). Where is that
check that is "independent of the requeue type (normal/pi)"?

When I look through futex_requeue(), all the likely looking sources
of EINVAL are governed by a check on the 'requeue_pi' argument.

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-16 20:54                     ` Michael Kerrisk (man-pages)
@ 2015-01-17  0:46                       ` Darren Hart
  2015-01-19 10:45                         ` Thomas Gleixner
  2015-01-23 18:19                         ` Torvald Riegel
  2015-01-17  0:56                       ` Davidlohr Bueso
  1 sibling, 2 replies; 80+ messages in thread
From: Darren Hart @ 2015-01-17  0:46 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), Thomas Gleixner
  Cc: Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Torvald Riegel, Roland McGrath, Darren Hart,
	Anton Blanchard, Petr Baudis, Eric Dumazet, bill o gallmeister,
	Jan Kiszka, Daniel Wagner, Rich Felker





On 1/16/15, 12:54 PM, "Michael Kerrisk (man-pages)"
<mtk.manpages@gmail.com> wrote:

>On 01/16/2015 04:20 PM, Thomas Gleixner wrote:
>> On Fri, 16 Jan 2015, Michael Kerrisk (man-pages) wrote:
>> 
>>> Hello Thomas,
>>>
>>> On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
>>>> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
>>>>>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
>>>>>
>>>>> ??? I added this, but does this error not occur only for PI requeues?
>>>>
>>>> It's equally wrong for normal futexes. And its actually the same code
>>>> checking for this for all variants.
>>>
>>> I don't understand "equally wrong" in your reply, I'm sorry. Do you
>>> mean:
>>>
>>> a) This error text should be there for both normal and PI requeues
>> 
>> It is there for both. The requeue code has that check independent of
>> the requeue type (normal/pi). It never makes sense to requeue
>> something to itself whether normal or pi futex. We added this for PI,
>> because there it is harmful, but we did not special case it. So normal
>> futexes get the same treatment.
>
>Hello Thomas, 
>
>Color me stupid, but I can't see this in futex_requeue(). Where is that
>check that is "independent of the requeue type (normal/pi)"?
>
>When I look through futex_requeue(), all the likely looking sources
>of EINVAL are governed by a check on the 'requeue_pi' argument.


Right, in the non-PI case, I believe there are valid use cases: move to
the back of the FIFO, for example (OK, maybe the only example?). Both
tests ensuring uaddr1 != uaddr2 are under the requeue_pi conditional
block. The second compares the keys in case they are not FUTEX_PRIVATE
(uaddrs would be different, but still the same backing store).

Thomas, am I missing a test for this someplace else?


-- 
Darren Hart
Intel Open Source Technology Center




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-16 20:54                     ` Michael Kerrisk (man-pages)
  2015-01-17  0:46                       ` Darren Hart
@ 2015-01-17  0:56                       ` Davidlohr Bueso
  2015-01-17  1:11                         ` Darren Hart
  1 sibling, 1 reply; 80+ messages in thread
From: Davidlohr Bueso @ 2015-01-17  0:56 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Thomas Gleixner, Carlos O'Donell, Darren Hart, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Torvald Riegel, Roland McGrath,
	Darren Hart, Anton Blanchard, Petr Baudis, Eric Dumazet,
	bill o gallmeister, Jan Kiszka, Daniel Wagner, Rich Felker

On Fri, 2015-01-16 at 21:54 +0100, Michael Kerrisk (man-pages) wrote:
> On 01/16/2015 04:20 PM, Thomas Gleixner wrote:
> > On Fri, 16 Jan 2015, Michael Kerrisk (man-pages) wrote:
> > 
> >> Hello Thomas,
> >>
> >> On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
> >>> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
> >>>>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
> >>>>
> >>>> ??? I added this, but does this error not occur only for PI requeues?
> >>>
> >>> It's equally wrong for normal futexes. And its actually the same code
> >>> checking for this for all variants.
> >>
> >> I don't understand "equally wrong" in your reply, I'm sorry. Do you
> >> mean:
> >>
> >> a) This error text should be there for both normal and PI requeues
> > 
> > It is there for both. The requeue code has that check independent of
> > the requeue type (normal/pi). It never makes sense to requeue
> > something to itself whether normal or pi futex. We added this for PI,
> > because there it is harmful, but we did not special case it. So normal
> > futexes get the same treatment.
> 
> Hello Thomas, 
> 
> Color me stupid, but I can't see this in futex_requeue(). Where is that
> check that is "independent of the requeue type (normal/pi)"?
> 
> When I look through futex_requeue(), all the likely looking sources
> of EINVAL are governed by a check on the 'requeue_pi' argument.

Yeah, its not very straightforward, I was also scratching my head. First
we do:

	if (requeue_pi) {
		/*
		 * Requeue PI only works on two distinct uaddrs. This
		 * check is only valid for private futexes. See below.
		 */
		if (uaddr1 == uaddr2)
			return -EINVAL;

Then:

	/*
	 * The check above which compares uaddrs is not sufficient for
	 * shared futexes. We need to compare the keys:
	 */
	if (requeue_pi && match_futex(&key1, &key2)) {
		ret = -EINVAL;
		goto out_put_keys;
	}

I wonder why we're checking for requeue_pi again, when, at least
according to the comments, it should be for shared. I guess it would
make sense depending on the mappings as the keys are the only true way
of determining if both futexes are the same, so perhaps:

	if ((requeue_pi || (flags & FLAGS_SHARED)) && match_futex())

That would also align with the retry labels.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-17  0:56                       ` Davidlohr Bueso
@ 2015-01-17  1:11                         ` Darren Hart
  0 siblings, 0 replies; 80+ messages in thread
From: Darren Hart @ 2015-01-17  1:11 UTC (permalink / raw)
  To: Davidlohr Bueso, Michael Kerrisk (man-pages)
  Cc: Thomas Gleixner, Carlos O'Donell, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Torvald Riegel, Roland McGrath, Darren Hart,
	Anton Blanchard, Petr Baudis, Eric Dumazet, bill o gallmeister,
	Jan Kiszka, Daniel Wagner, Rich Felker

On 1/16/15, 4:56 PM, "Davidlohr Bueso" <dave@stgolabs.net> wrote:


>On Fri, 2015-01-16 at 21:54 +0100, Michael Kerrisk (man-pages) wrote:
>> On 01/16/2015 04:20 PM, Thomas Gleixner wrote:
>> > On Fri, 16 Jan 2015, Michael Kerrisk (man-pages) wrote:
>> > 
>> >> Hello Thomas,
>> >>
>> >> On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
>> >>> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
>> >>>>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
>> >>>>
>> >>>> ??? I added this, but does this error not occur only for PI
>>requeues?
>> >>>
>> >>> It's equally wrong for normal futexes. And its actually the same
>>code
>> >>> checking for this for all variants.
>> >>
>> >> I don't understand "equally wrong" in your reply, I'm sorry. Do you
>> >> mean:
>> >>
>> >> a) This error text should be there for both normal and PI requeues
>> > 
>> > It is there for both. The requeue code has that check independent of
>> > the requeue type (normal/pi). It never makes sense to requeue
>> > something to itself whether normal or pi futex. We added this for PI,
>> > because there it is harmful, but we did not special case it. So normal
>> > futexes get the same treatment.
>> 
>> Hello Thomas, 
>> 
>> Color me stupid, but I can't see this in futex_requeue(). Where is that
>> check that is "independent of the requeue type (normal/pi)"?
>> 
>> When I look through futex_requeue(), all the likely looking sources
>> of EINVAL are governed by a check on the 'requeue_pi' argument.
>
>Yeah, its not very straightforward, I was also scratching my head. First
>we do:
>
>	if (requeue_pi) {
>		/*
>		 * Requeue PI only works on two distinct uaddrs. This
>		 * check is only valid for private futexes. See below.
>		 */
>		if (uaddr1 == uaddr2)
>			return -EINVAL;

We check here to abort as early as possible for the usual security reasons.

>
>Then:
>
>	/*
>	 * The check above which compares uaddrs is not sufficient for
>	 * shared futexes. We need to compare the keys:
>	 */
>	if (requeue_pi && match_futex(&key1, &key2)) {
>		ret = -EINVAL;
>		goto out_put_keys;
>	}
>
>I wonder why we're checking for requeue_pi again, when, at least
>according to the comments, it should be for shared. I guess it would
>make sense depending on the mappings as the keys are the only true way
>of determining if both futexes are the same, so perhaps:
>
>	if ((requeue_pi || (flags & FLAGS_SHARED)) && match_futex())

No, the rule only applies to requeue_pi. This check is the for-sure
version of the uaddr comparison above. We could add if flags &
FLAGS_SHARED, but I'm not sure it's worth it.

--
Darren Hart
Intel Open Source Technology Center




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-15 15:12               ` Michael Kerrisk (man-pages)
@ 2015-01-17  1:33                 ` Darren Hart
  2015-01-17  9:16                   ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 80+ messages in thread
From: Darren Hart @ 2015-01-17  1:33 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), Thomas Gleixner
  Cc: Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Arnd Bergmann, Steven Rostedt, Peter Zijlstra, Linux API,
	Davidlohr Bueso

Corrected Davidlohr's email address.

On 1/15/15, 7:12 AM, "Michael Kerrisk (man-pages)"
<mtk.manpages@gmail.com> wrote:

>Hello Darren,
>
>I give you the same apology as to Thomas for the
>long-delayed response to your mail.
>
>And I repeat my note to Thomas:
>In the next day or two, I hope to send out the new version
>of the futex(2) page for review. The new draft is a bit
>bigger (okay -- 4 x bigger) than the current page. And there
>are a quite number of FIXMEs that I've placed in the page
>for various points--some minor, but a few major--that need
>to be checked or fixed. Would you have some time to review
>that page?

I'll make the time for that. I've wanted to see this for a while, so thank
you for working on it!

> 
>
>In the meantime, I have a couple of questions, which, if
>you could answer them, I would work some changes into the
>page before sending.
>
>1. In various places, distinction is made between non-PI
>   futexs and PI futexes. But what determines that distinction?
>   From the kernel's perspective, hat make a futex one type
>   or another? I presume it is to do with the types of blocking
>   waiters on the futex, but it would be good to have a formal
>   definition.

You're right in that a uaddr is a uaddr is a uaddr. Also "there is no such
thing as a futex", it doesn't exist as any kind of identifiable object, so
these discussions can get rather confusing :-)

A "futex" becomes a PI futex when it is "created" via a PI futex op code.
At that point, the syscall will ensure a pi_state is populated for the
futex_q entry. See futex_lock_pi() for example. Before the locks are
taken, there is a call to refill_pi_state_cache() which preps a pi_state
for assignment later in futex_lock_pi_atomic(). This pi_state provides the
necessary linkage to perform the priority boosting in the event of a
priority inversion. This is handled externally from the futexes via the
rt_mutex construct.

Clear as mud?


>
>2. Can you say something about the pairing requirements of
>   FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI.
>   What is the requirement and why do we need it?

Briefly, these op codes exist to support a fairly specific use case:
support for PI aware pthread condvars (glibc patch acceptance STILL
PENDING FOR LOVE OF EVERYTHING HOLY WHY?!?!?! But is shipped with various
PREEMPT_RT enabled Linux systems. Because these calls are paired, and more
of the logic can happen on the kernel side (to preserve ownership of an
rt_mutex with waiters), so in order to ensure userspace and kernelspace
remain in sync, we pre-specify the target of the requeue in
futex_wait_requeue_pi. This also limits the attack surface by only
supporting exactly what it was meant to do. The corner cases get insane
otherwise.

We could walk through the various ways in which it would break if these
pairing restrictions were not in place, but I'll have to take some serious
time to page all those into working memory. Let me know if we need more
detail here and I will.

>
>Most of the rest of this mail is just a checklist noting
>what I did with your comments. No response is needed
>in most cases, but there is one that I have marked with
>"???". If you could reply to that. I'd be grateful.

...

>> For all the PI opcodes, we should probably mention something about the
>> futex value scheme (TID), whereas the other opcodes do not require any
>> specific value scheme.
>> 
>> No Owner:	0
>> Owner:		TID
>> Waiters:	TID | FUTEX_WAITERS
>> 
>> This is the relevant section from the referenced paper:
>> 				
>> The PI futex operations diverge from the oth-
>> ers in that they impose a policy describing how
>> the futex value is to be used. If the lock is un-
>> owned, the futex value shall be 0. If owned, it
>> shall be the thread id (tid) of the owning thread.
>> If there are threads contending for the lock, then
>> the FUTEX_WAITERS flag is set. With this policy in
>> place, userspace can atomically acquire an unowned
>> lock or release an uncontended lock using an atomic
>> instruction and their own tid. A non-zero futex
>> value will force waiters into the kernel to lock. The
>> FUTEX_WAITERS flag forces the owner into the kernel
>> to unlock. If the callers are forced into the kernel,
>> they then deal directly with an underlying rt_mutex
>> which implements the priority inheritance semantics.
>> After the rt_mutex is acquired, the futex value is up-
>> dated accordingly, before the calling thread returns
>> to userspace.
>>
>> It is important to note that the kernel will update the futex value
>>prior
>> to returning to userspace. Unlike other futex op codes,
>> FUTEX_CMP_REUQUE_PI (and FUTEX_WAIT_REQUEUE_PI, FUTEX_LOCK_PI are
>>designed
>> for the implementation of very specific IPC mechanisms).
>
>??? Great text. May I presume that I can take this text
>and freely adapt it for the man page? (Actually, this is a
>request for forgiveness, rather than permission :-).)

Thanks, and no objection from me.

--
Darren Hart
Intel Open Source Technology Center



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-17  1:33                 ` Darren Hart
@ 2015-01-17  9:16                   ` Michael Kerrisk (man-pages)
  2015-01-17 19:26                     ` Darren Hart
  0 siblings, 1 reply; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-17  9:16 UTC (permalink / raw)
  To: Darren Hart, Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Davidlohr Bueso, Jan Kiszka

Hello Darren,

On 01/17/2015 02:33 AM, Darren Hart wrote:
> Corrected Davidlohr's email address.

Thanks!

> On 1/15/15, 7:12 AM, "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
> 
>> Hello Darren,
>>
>> I give you the same apology as to Thomas for the
>> long-delayed response to your mail.
>>
>> And I repeat my note to Thomas:
>> In the next day or two, I hope to send out the new version
>> of the futex(2) page for review. The new draft is a bit
>> bigger (okay -- 4 x bigger) than the current page. And there
>> are a quite number of FIXMEs that I've placed in the page
>> for various points--some minor, but a few major--that need
>> to be checked or fixed. Would you have some time to review
>> that page?
> 
> I'll make the time for that. I've wanted to see this for a while, so thank
> you for working on it!

Great!

>> In the meantime, I have a couple of questions, which, if
>> you could answer them, I would work some changes into the
>> page before sending.
>>
>> 1. In various places, distinction is made between non-PI
>>   futexs and PI futexes. But what determines that distinction?
>>   From the kernel's perspective, hat make a futex one type
>>   or another? I presume it is to do with the types of blocking
>>   waiters on the futex, but it would be good to have a formal
>>   definition.
> 
> You're right in that a uaddr is a uaddr is a uaddr. Also "there is no such
> thing as a futex", it doesn't exist as any kind of identifiable object, so
> these discussions can get rather confusing :-)

So, I want to make sure that I am clear on what you mean you say this.
You say "there is no such thing as a futex" because from the kernel's
perspective there is no visible entity in the uncontended case
(where everything can be dealt with in user space). And from user-space,
in the uncontended case all we're doing is memory operations. Right?

On the other hand, from a kernel perspective, we could say that a 
futex "exists" in the contended phases, since the kernel has allocated
state associated with the uaddr. Right?

> A "futex" becomes a PI futex when it is "created" via a PI futex op code.

Precisely which PI op codes? Is it: FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI, and
FUTEX_CMP_REQUEUE_PI, and not FUTEX_WAIT_REQUEUE_PI or FUTEX_UNLOCK_PI?

> At that point, the syscall will ensure a pi_state is populated for the
> futex_q entry. See futex_lock_pi() for example. Before the locks are
> taken, there is a call to refill_pi_state_cache() which preps a pi_state
> for assignment later in futex_lock_pi_atomic(). This pi_state provides the
> necessary linkage to perform the priority boosting in the event of a
> priority inversion. This is handled externally from the futexes via the
> rt_mutex construct.
> 
> Clear as mud?

Not quite that bad, but... The thing is, still, the man page has text
such as the following (based on your wording):

       FUTEX_CMP_REQUEUE_PI (since Linux 2.6.31)
              This operation is a PI-aware variant of FUTEX_CMP_REQUEUE.
              It    requeues    waiters    that    are    blocked    via
              FUTEX_WAIT_REQUEUE_PI  on uaddr from a non-PI source futex
              (uaddr) to a PI target futex (uaddr2).

And elsewhere you said

    EINVAL is returned if the non-pi to pi or 
    op pairing semantics are violated.

When someone in user-land (e.g., me) reads pieces like that, they then 
want to find somewhere in the man page a description of what makes a 
futex a *PI futex* and probably some statements of the distinction 
between PI and non-PI futexes. And those statements should be from a 
perspective that is somewhat comprehensible to user-space. I'm not
yet confident that I can do that. Do you care to take a shot at it?

>> 2. Can you say something about the pairing requirements of
>>   FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI.
>>   What is the requirement and why do we need it?
> 
> Briefly, these op codes exist to support a fairly specific use case:
> support for PI aware pthread condvars (glibc patch acceptance STILL
> PENDING FOR LOVE OF EVERYTHING HOLY WHY?!?!?! 

Yes, Jan Kiszka recently alerted me to the existence of 
https://sourceware.org/bugzilla/show_bug.cgi?id=11588
and I still have some text that you proposed (mail titled
("Pthread Condition Variables and Priority Inversion")
quite a long time ago for the pthread_cond_timedwait() page.
One day, when that page exists, I'll try to remember to add it.

> But is shipped with various
> PREEMPT_RT enabled Linux systems. Because these calls are paired, and more
> of the logic can happen on the kernel side (to preserve ownership of an
> rt_mutex with waiters), so in order to ensure userspace and kernelspace
> remain in sync, we pre-specify the target of the requeue in
> futex_wait_requeue_pi. This also limits the attack surface by only
> supporting exactly what it was meant to do. The corner cases get insane
> otherwise.

Thanks. I've added some text on pairing, based on your text above.

> We could walk through the various ways in which it would break if these
> pairing restrictions were not in place, but I'll have to take some serious
> time to page all those into working memory. Let me know if we need more
> detail here and I will.

I don't think we need that much level of detail.

>> Most of the rest of this mail is just a checklist noting
>> what I did with your comments. No response is needed
>> in most cases, but there is one that I have marked with
>> "???". If you could reply to that. I'd be grateful.
> 
> ...
> 
>>> For all the PI opcodes, we should probably mention something about the
>>> futex value scheme (TID), whereas the other opcodes do not require any
>>> specific value scheme.
>>>
>>> No Owner:	0
>>> Owner:		TID
>>> Waiters:	TID | FUTEX_WAITERS
>>>
>>> This is the relevant section from the referenced paper:
>>> 				
>>> The PI futex operations diverge from the oth-
>>> ers in that they impose a policy describing how
>>> the futex value is to be used. If the lock is un-
>>> owned, the futex value shall be 0. If owned, it
>>> shall be the thread id (tid) of the owning thread.
>>> If there are threads contending for the lock, then
>>> the FUTEX_WAITERS flag is set. With this policy in
>>> place, userspace can atomically acquire an unowned
>>> lock or release an uncontended lock using an atomic
>>> instruction and their own tid. A non-zero futex
>>> value will force waiters into the kernel to lock. The
>>> FUTEX_WAITERS flag forces the owner into the kernel
>>> to unlock. If the callers are forced into the kernel,
>>> they then deal directly with an underlying rt_mutex
>>> which implements the priority inheritance semantics.
>>> After the rt_mutex is acquired, the futex value is up-
>>> dated accordingly, before the calling thread returns
>>> to userspace.
>>>
>>> It is important to note that the kernel will update the futex value
>>> prior
>>> to returning to userspace. Unlike other futex op codes,
>>> FUTEX_CMP_REUQUE_PI (and FUTEX_WAIT_REQUEUE_PI, FUTEX_LOCK_PI are
>>> designed
>>> for the implementation of very specific IPC mechanisms).
>>
>> ??? Great text. May I presume that I can take this text
>> and freely adapt it for the man page? (Actually, this is a
>> request for forgiveness, rather than permission :-).)
> 
> Thanks, and no objection from me.

Thanks.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-17  9:16                   ` Michael Kerrisk (man-pages)
@ 2015-01-17 19:26                     ` Darren Hart
  2015-01-18 10:18                       ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 80+ messages in thread
From: Darren Hart @ 2015-01-17 19:26 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), Thomas Gleixner
  Cc: Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Arnd Bergmann, Steven Rostedt, Peter Zijlstra, Linux API,
	Davidlohr Bueso, Jan Kiszka


On 1/17/15, 1:16 AM, "Michael Kerrisk (man-pages)"
<mtk.manpages@gmail.com> wrote:

>Hello Darren,
>
>On 01/17/2015 02:33 AM, Darren Hart wrote:
>> Corrected Davidlohr's email address.
>
>Thanks!
>
>> On 1/15/15, 7:12 AM, "Michael Kerrisk (man-pages)"
>> <mtk.manpages@gmail.com> wrote:
>> 
>>> Hello Darren,
>>>
>>> I give you the same apology as to Thomas for the
>>> long-delayed response to your mail.
>>>
>>> And I repeat my note to Thomas:
>>> In the next day or two, I hope to send out the new version
>>> of the futex(2) page for review. The new draft is a bit
>>> bigger (okay -- 4 x bigger) than the current page. And there
>>> are a quite number of FIXMEs that I've placed in the page
>>> for various points--some minor, but a few major--that need
>>> to be checked or fixed. Would you have some time to review
>>> that page?
>> 
>> I'll make the time for that. I've wanted to see this for a while, so
>>thank
>> you for working on it!
>
>Great!
>
>>> In the meantime, I have a couple of questions, which, if
>>> you could answer them, I would work some changes into the
>>> page before sending.
>>>
>>> 1. In various places, distinction is made between non-PI
>>>   futexs and PI futexes. But what determines that distinction?
>>>   From the kernel's perspective, hat make a futex one type
>>>   or another? I presume it is to do with the types of blocking
>>>   waiters on the futex, but it would be good to have a formal
>>>   definition.
>> 
>> You're right in that a uaddr is a uaddr is a uaddr. Also "there is no
>>such
>> thing as a futex", it doesn't exist as any kind of identifiable object,
>>so
>> these discussions can get rather confusing :-)
>
>So, I want to make sure that I am clear on what you mean you say this.
>You say "there is no such thing as a futex" because from the kernel's
>perspective there is no visible entity in the uncontended case
>(where everything can be dealt with in user space). And from user-space,
>in the uncontended case all we're doing is memory operations. Right?
>
>On the other hand, from a kernel perspective, we could say that a
>futex "exists" in the contended phases, since the kernel has allocated
>state associated with the uaddr. Right?


Sorry, this was more anecdotal, and probably more of a distraction than
constructive. I just meant that unlike other things which you can point to
a specific struct for (task, rt_mutex, etc.), a "futex" has it's state
distributed across the backing store (uaddr), the queue (futex_q), the
pi_state, the rt_mutex, etc, and these span kernel space and userspace.
Your description above is correct.

>
>> A "futex" becomes a PI futex when it is "created" via a PI futex op
>>code.
>
>Precisely which PI op codes? Is it: FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI, and
>FUTEX_CMP_REQUEUE_PI, and not FUTEX_WAIT_REQUEUE_PI or FUTEX_UNLOCK_PI?

Based on your wording below about taking a user POV on this, I'm going to
say "yes" here. These opcodes paired with the PI futex value policy
(described below) defines a "futex" as PI aware. These were created very
specifically in support of PI pthread_mutexes, so it makes a lot more
sense to talk about a PI aware pthread_mutex, than a PI aware futex, since
there is a lot of policy and scaffolding that has to be built up around it
to use it properly (this is what a PI pthread_mutex is).

>> At that point, the syscall will ensure a pi_state is populated for the
>> futex_q entry. See futex_lock_pi() for example. Before the locks are
>> taken, there is a call to refill_pi_state_cache() which preps a pi_state
>> for assignment later in futex_lock_pi_atomic(). This pi_state provides
>>the
>> necessary linkage to perform the priority boosting in the event of a
>> priority inversion. This is handled externally from the futexes via the
>> rt_mutex construct.
>> 
>> Clear as mud?
>
>Not quite that bad, but... The thing is, still, the man page has text
>such as the following (based on your wording):
>
>       FUTEX_CMP_REQUEUE_PI (since Linux 2.6.31)
>              This operation is a PI-aware variant of FUTEX_CMP_REQUEUE.
>              It    requeues    waiters    that    are    blocked    via
>              FUTEX_WAIT_REQUEUE_PI  on uaddr from a non-PI source futex
>              (uaddr) to a PI target futex (uaddr2).
>
>And elsewhere you said
>
>    EINVAL is returned if the non-pi to pi or
>    op pairing semantics are violated.
>
>When someone in user-land (e.g., me) reads pieces like that, they then
>want to find somewhere in the man page a description of what makes a
>futex a *PI futex* and probably some statements of the distinction
>between PI and non-PI futexes. And those statements should be from a
>perspective that is somewhat comprehensible to user-space. I'm not
>yet confident that I can do that. Do you care to take a shot at it?

Hrm, tricky indeed. From userspace, what makes a "futex" PI is the policy
agreement between kernel and userspace (which is the value of the futex:
0, TID, TID|WAITERS, and never just WAITERS, and the use of PI aware futex
op codes when making the futex syscalls.

For a longer discussion of this policy, see Documentation/pi-futex.txt.
Also note that this policy can be combined with that for robust futexes,
adding the OWNERDIED component.

--
Darren Hart
Intel Open Source Technology Center







^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-17 19:26                     ` Darren Hart
@ 2015-01-18 10:18                       ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-18 10:18 UTC (permalink / raw)
  To: Darren Hart, Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Davidlohr Bueso, Jan Kiszka

Hello Darren,

On 01/17/2015 08:26 PM, Darren Hart wrote:
> 
> On 1/17/15, 1:16 AM, "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:

[...]

>>>> In the meantime, I have a couple of questions, which, if
>>>> you could answer them, I would work some changes into the
>>>> page before sending.
>>>>
>>>> 1. In various places, distinction is made between non-PI
>>>>   futexs and PI futexes. But what determines that distinction?
>>>>   From the kernel's perspective, hat make a futex one type
>>>>   or another? I presume it is to do with the types of blocking
>>>>   waiters on the futex, but it would be good to have a formal
>>>>   definition.
>>>
>>> You're right in that a uaddr is a uaddr is a uaddr. Also "there is no
>>> such
>>> thing as a futex", it doesn't exist as any kind of identifiable object,
>>> so
>>> these discussions can get rather confusing :-)
>>
>> So, I want to make sure that I am clear on what you mean you say this.
>> You say "there is no such thing as a futex" because from the kernel's
>> perspective there is no visible entity in the uncontended case
>> (where everything can be dealt with in user space). And from user-space,
>> in the uncontended case all we're doing is memory operations. Right?
>>
>> On the other hand, from a kernel perspective, we could say that a
>> futex "exists" in the contended phases, since the kernel has allocated
>> state associated with the uaddr. Right?
> 
> 
> Sorry, this was more anecdotal, and probably more of a distraction than
> constructive. I just meant that unlike other things which you can point to
> a specific struct for (task, rt_mutex, etc.), a "futex" has it's state
> distributed across the backing store (uaddr), the queue (futex_q), the
> pi_state, the rt_mutex, etc, and these span kernel space and userspace.
> Your description above is correct.

Okay. Thanks. I've added a few more words to the page noting that
the kernel maintains no state for a futex in the uncontended state.

>>> A "futex" becomes a PI futex when it is "created" via a PI futex op
>>> code.
>>
>> Precisely which PI op codes? Is it: FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI, and
>> FUTEX_CMP_REQUEUE_PI, and not FUTEX_WAIT_REQUEUE_PI or FUTEX_UNLOCK_PI?
> 
> Based on your wording below about taking a user POV on this, I'm going to
> say "yes" here. These opcodes paired with the PI futex value policy
> (described below) defines a "futex" as PI aware. These were created very
> specifically in support of PI pthread_mutexes, so it makes a lot more
> sense to talk about a PI aware pthread_mutex, than a PI aware futex, since
> there is a lot of policy and scaffolding that has to be built up around it
> to use it properly (this is what a PI pthread_mutex is).

See below.

>>> At that point, the syscall will ensure a pi_state is populated for the
>>> futex_q entry. See futex_lock_pi() for example. Before the locks are
>>> taken, there is a call to refill_pi_state_cache() which preps a pi_state
>>> for assignment later in futex_lock_pi_atomic(). This pi_state provides
>>> the
>>> necessary linkage to perform the priority boosting in the event of a
>>> priority inversion. This is handled externally from the futexes via the
>>> rt_mutex construct.
>>>
>>> Clear as mud?
>>
>> Not quite that bad, but... The thing is, still, the man page has text
>> such as the following (based on your wording):
>>
>>       FUTEX_CMP_REQUEUE_PI (since Linux 2.6.31)
>>              This operation is a PI-aware variant of FUTEX_CMP_REQUEUE.
>>              It    requeues    waiters    that    are    blocked    via
>>              FUTEX_WAIT_REQUEUE_PI  on uaddr from a non-PI source futex
>>              (uaddr) to a PI target futex (uaddr2).
>>
>> And elsewhere you said
>>
>>    EINVAL is returned if the non-pi to pi or
>>    op pairing semantics are violated.
>>
>> When someone in user-land (e.g., me) reads pieces like that, they then
>> want to find somewhere in the man page a description of what makes a
>> futex a *PI futex* and probably some statements of the distinction
>> between PI and non-PI futexes. And those statements should be from a
>> perspective that is somewhat comprehensible to user-space. I'm not
>> yet confident that I can do that. Do you care to take a shot at it?
> 
> Hrm, tricky indeed. From userspace, what makes a "futex" PI is the policy
> agreement between kernel and userspace (which is the value of the futex:
> 0, TID, TID|WAITERS, and never just WAITERS, and the use of PI aware futex
> op codes when making the futex syscalls.

Okay -- I've attempted to capture this in some text that I added to the 
page.

> For a longer discussion of this policy, see Documentation/pi-futex.txt.

Sad to say, that document doesn't supply that much more detail, in
my reading of it, at least.

> Also note that this policy can be combined with that for robust futexes,
> adding the OWNERDIED component.

Now there's two other stories that have yet to be dealt with ;-). 

I have a FIXME already in the page regarding OWNERDIED, and
get_robust_list(2) is another page that seems like it could do with 
a fair bit of work, but that's a story for another day.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-17  0:46                       ` Darren Hart
@ 2015-01-19 10:45                         ` Thomas Gleixner
  2015-01-19 14:07                           ` Michael Kerrisk (man-pages)
  2015-01-23 18:19                         ` Torvald Riegel
  1 sibling, 1 reply; 80+ messages in thread
From: Thomas Gleixner @ 2015-01-19 10:45 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Torvald Riegel, Roland McGrath, Darren Hart,
	Anton Blanchard, Petr Baudis, Eric Dumazet, bill o gallmeister,
	Jan Kiszka, Daniel Wagner, Rich Felker

On Fri, 16 Jan 2015, Darren Hart wrote:
> On 1/16/15, 12:54 PM, "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
> 
> >On 01/16/2015 04:20 PM, Thomas Gleixner wrote:
> >> On Fri, 16 Jan 2015, Michael Kerrisk (man-pages) wrote:
> >> 
> >>> Hello Thomas,
> >>>
> >>> On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
> >>>> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
> >>>>>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
> >>>>>
> >>>>> ??? I added this, but does this error not occur only for PI requeues?
> >>>>
> >>>> It's equally wrong for normal futexes. And its actually the same code
> >>>> checking for this for all variants.
> >>>
> >>> I don't understand "equally wrong" in your reply, I'm sorry. Do you
> >>> mean:
> >>>
> >>> a) This error text should be there for both normal and PI requeues
> >> 
> >> It is there for both. The requeue code has that check independent of
> >> the requeue type (normal/pi). It never makes sense to requeue
> >> something to itself whether normal or pi futex. We added this for PI,
> >> because there it is harmful, but we did not special case it. So normal
> >> futexes get the same treatment.
> >
> >Hello Thomas, 
> >
> >Color me stupid, but I can't see this in futex_requeue(). Where is that
> >check that is "independent of the requeue type (normal/pi)"?
> >
> >When I look through futex_requeue(), all the likely looking sources
> >of EINVAL are governed by a check on the 'requeue_pi' argument.
> 
> 
> Right, in the non-PI case, I believe there are valid use cases: move to
> the back of the FIFO, for example (OK, maybe the only example?). Both
> tests ensuring uaddr1 != uaddr2 are under the requeue_pi conditional
> block. The second compares the keys in case they are not FUTEX_PRIVATE
> (uaddrs would be different, but still the same backing store).
> 
> Thomas, am I missing a test for this someplace else?

No, I had a short look at the code misread it. So, yes, it's a valid
operation for the non PI case. Sorry for the confusion.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-19 10:45                         ` Thomas Gleixner
@ 2015-01-19 14:07                           ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-19 14:07 UTC (permalink / raw)
  To: Thomas Gleixner, Darren Hart
  Cc: mtk.manpages, Carlos O'Donell, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Torvald Riegel, Roland McGrath,
	Darren Hart, Anton Blanchard, Petr Baudis, Eric Dumazet,
	bill o gallmeister, Jan Kiszka, Daniel Wagner, Rich Felker

On 01/19/2015 11:45 AM, Thomas Gleixner wrote:
> On Fri, 16 Jan 2015, Darren Hart wrote:
>> On 1/16/15, 12:54 PM, "Michael Kerrisk (man-pages)"
>> <mtk.manpages@gmail.com> wrote:
>>
>>> On 01/16/2015 04:20 PM, Thomas Gleixner wrote:
>>>> On Fri, 16 Jan 2015, Michael Kerrisk (man-pages) wrote:
>>>>
>>>>> Hello Thomas,
>>>>>
>>>>> On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
>>>>>> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
>>>>>>>> [EINVAL] uaddr equal uaddr2. Requeue to same futex.
>>>>>>>
>>>>>>> ??? I added this, but does this error not occur only for PI requeues?
>>>>>>
>>>>>> It's equally wrong for normal futexes. And its actually the same code
>>>>>> checking for this for all variants.
>>>>>
>>>>> I don't understand "equally wrong" in your reply, I'm sorry. Do you
>>>>> mean:
>>>>>
>>>>> a) This error text should be there for both normal and PI requeues
>>>>
>>>> It is there for both. The requeue code has that check independent of
>>>> the requeue type (normal/pi). It never makes sense to requeue
>>>> something to itself whether normal or pi futex. We added this for PI,
>>>> because there it is harmful, but we did not special case it. So normal
>>>> futexes get the same treatment.
>>>
>>> Hello Thomas, 
>>>
>>> Color me stupid, but I can't see this in futex_requeue(). Where is that
>>> check that is "independent of the requeue type (normal/pi)"?
>>>
>>> When I look through futex_requeue(), all the likely looking sources
>>> of EINVAL are governed by a check on the 'requeue_pi' argument.
>>
>>
>> Right, in the non-PI case, I believe there are valid use cases: move to
>> the back of the FIFO, for example (OK, maybe the only example?). Both
>> tests ensuring uaddr1 != uaddr2 are under the requeue_pi conditional
>> block. The second compares the keys in case they are not FUTEX_PRIVATE
>> (uaddrs would be different, but still the same backing store).
>>
>> Thomas, am I missing a test for this someplace else?
> 
> No, I had a short look at the code misread it. So, yes, it's a valid
> operation for the non PI case. Sorry for the confusion.

Thanks for the confirmation, Thomas. Page updated to remove 
FUTEX_CMP_REQUEUE from that error case.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-17  0:46                       ` Darren Hart
  2015-01-19 10:45                         ` Thomas Gleixner
@ 2015-01-23 18:19                         ` Torvald Riegel
  2015-01-24 10:05                           ` Thomas Gleixner
  1 sibling, 1 reply; 80+ messages in thread
From: Torvald Riegel @ 2015-01-23 18:19 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Carlos O'Donell, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Darren Hart, Anton Blanchard,
	Petr Baudis, Eric Dumazet, bill o gallmeister, Jan Kiszka,
	Daniel Wagner, Rich Felker

On Fri, 2015-01-16 at 16:46 -0800, Darren Hart wrote:
> On 1/16/15, 12:54 PM, "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
> 
> >Color me stupid, but I can't see this in futex_requeue(). Where is that
> >check that is "independent of the requeue type (normal/pi)"?
> >
> >When I look through futex_requeue(), all the likely looking sources
> >of EINVAL are governed by a check on the 'requeue_pi' argument.
> 
> 
> Right, in the non-PI case, I believe there are valid use cases: move to
> the back of the FIFO, for example (OK, maybe the only example?).

But we never guarantee a futex is a FIFO, or do we?  If we don't, then
such a requeue could be implemented as a no-op by the kernel, which
would sort of invalidate the use case.

(And I guess we don't want to guarantee FIFO behavior for futexes.)



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-15 15:10             ` Michael Kerrisk (man-pages)
  2015-01-15 22:23               ` Thomas Gleixner
@ 2015-01-23 18:29               ` Torvald Riegel
  2015-01-24 11:35                 ` Thomas Gleixner
  1 sibling, 1 reply; 80+ messages in thread
From: Torvald Riegel @ 2015-01-23 18:29 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Thomas Gleixner, Carlos O'Donell, Darren Hart, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Davidlohr Bueso, Arnd Bergmann,
	Steven Rostedt, Peter Zijlstra, Linux API, Darren Hart,
	Anton Blanchard, Petr Baudis, Eric Dumazet, bill o gallmeister,
	Jan Kiszka, Daniel Wagner, Rich Felker

On Thu, 2015-01-15 at 16:10 +0100, Michael Kerrisk (man-pages) wrote:
> [Adding a few people to CC that have expressed interest in the 
> progress of the updates of this page, or who may be able to
> provide review feedback. Eventually, you'll all get CCed on
> the new draft of the page.]
> 
> Hello Thomas,
> 
> On 05/15/2014 04:14 PM, Thomas Gleixner wrote:
> > On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
> >> And that universe would love to have your documentation of 
> >> FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),
> > 
> > I give you almost the full treatment, but I leave REQUEUE_PI to
> > Darren and FUTEX_WAKE_OP to Jakub. :)
> 
> Thank you for the great effort you put into compiling the
> text below, and apologies for my long delay in following up.
> 
> I've integrated almost all of your suggestions into the 
> manual page. I will shortly send out a new draft of the
> page that contains various FIXMEs for points that remain 
> unclear.

Michael, thanks for working on the draft!  I'll review the draft closely
once you've sent it (or have I missed it?).

There are a few things that I'd like to see covered.

First, we should discuss that users, until they control all code in the
respective process, need to expect futexes to be affected by spurious
futex_wake calls; see https://lkml.org/lkml/2014/11/27/472 for
background and Linus' choice (AFAIU) to just document this.
So, for example, a return code of 0 for FUTEX_WAIT can mean either being
woken up by a FUTEX_WAKE intended for this futex, or a stale one
intended for another futex used by, for example, glibc internally.
(Note that as explained in this thread, this isn't just a glibc
artifact, but a result of the general futex design mixed with
destruction requirements for mutexes and other constructs in C++11 and
POSIX.)
It might also be necessary to further consider this when documenting the
errors, because it does affect how to handle them. See this for a glibc
perspective:
https://sourceware.org/ml/libc-alpha/2014-09/msg00381.html

Second, the current documentation for EINTR is that it can happen due to
receiving a signal *or* due to a spurious wake-up.  This is difficult to
handle when implementing POSIX semaphores, because they require that
EINTR is returned from SEM_WAIT if and only if the interruption was due
to a signal.  Thus, if FUTEX_WAIT returns EINTR, the semaphore
implementation can't return EINTR from sem_wait; see this for more
comments, including some discussion why use cases relying on the POSIX
requirement around EINTR are borderline timing-dependent:
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/sem_waitcommon.c;h=96848d7ac5b6f8f1f3099b422deacc09323c796a;hb=HEAD#l282
Others have commented that aio_suspend has a similar issue; if EINTR
wouldn't in fact be returned spuriously, the POSIX-implementation-side
would get easier.

Third, I think it would be useful to -- somewhere -- explain which
behavior the futex operations would have conceptually when expressed by
C11 code.  We currently say that they wake up, sleep, etc, and which
values they return.  But we never say how to properly synchronize with
them on the userspace side.  The C11 memory model is probably the best
model to use on the userspace side, so that's why I'm arguing for this.
Basically, I think we need to (1) tell people that they should use
memory_order_relaxed accesses to the futex variable (ie, the memory
location associated with the whole futex construct on the kernel side --
or do we have another name for this?), and (2) give some conceptual
guarantees for the kernel-side synchronization so that one use this to
derive how to use them correctly in userspace.

The man pages might not be the right place for this, and maybe we just
need a revision of "Futexes are tricky".  If you have other suggestions
for where to document this, or on the content, let me know.  (I'm also
willing to spend time on this :) ).


Torvald





^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-23 18:19                         ` Torvald Riegel
@ 2015-01-24 10:05                           ` Thomas Gleixner
  2015-01-24 12:58                             ` Torvald Riegel
  0 siblings, 1 reply; 80+ messages in thread
From: Thomas Gleixner @ 2015-01-24 10:05 UTC (permalink / raw)
  To: Torvald Riegel
  Cc: Darren Hart, Michael Kerrisk (man-pages),
	Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Darren Hart, Anton Blanchard, Petr Baudis,
	Eric Dumazet, bill o gallmeister, Jan Kiszka, Daniel Wagner,
	Rich Felker

On Fri, 23 Jan 2015, Torvald Riegel wrote:

> On Fri, 2015-01-16 at 16:46 -0800, Darren Hart wrote:
> > On 1/16/15, 12:54 PM, "Michael Kerrisk (man-pages)"
> > <mtk.manpages@gmail.com> wrote:
> > 
> > >Color me stupid, but I can't see this in futex_requeue(). Where is that
> > >check that is "independent of the requeue type (normal/pi)"?
> > >
> > >When I look through futex_requeue(), all the likely looking sources
> > >of EINVAL are governed by a check on the 'requeue_pi' argument.
> > 
> > 
> > Right, in the non-PI case, I believe there are valid use cases: move to
> > the back of the FIFO, for example (OK, maybe the only example?).
> 
> But we never guarantee a futex is a FIFO, or do we?  If we don't, then
> such a requeue could be implemented as a no-op by the kernel, which
> would sort of invalidate the use case.
> 
> (And I guess we don't want to guarantee FIFO behavior for futexes.)

The (current) behaviour is:

    real-time threads:   FIFO per priority level
    sched-other threads: FIFO independent of nice level

The wakeup is priority ordered. Highest priority level first.

Thanks,

	tglx




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-23 18:29               ` Torvald Riegel
@ 2015-01-24 11:35                 ` Thomas Gleixner
  2015-01-24 13:12                   ` Torvald Riegel
  2015-02-05 19:57                   ` Darren Hart
  0 siblings, 2 replies; 80+ messages in thread
From: Thomas Gleixner @ 2015-01-24 11:35 UTC (permalink / raw)
  To: Torvald Riegel
  Cc: Michael Kerrisk (man-pages),
	Carlos O'Donell, Darren Hart, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Davidlohr Bueso, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Darren Hart, Anton Blanchard,
	Petr Baudis, Eric Dumazet, bill o gallmeister, Jan Kiszka,
	Daniel Wagner, Rich Felker

On Fri, 23 Jan 2015, Torvald Riegel wrote:
> Second, the current documentation for EINTR is that it can happen due to
> receiving a signal *or* due to a spurious wake-up.  This is difficult to

I don't think so. I went through all callchains again with a fine comb.

futex_wait()
retry:
	ret = futex_wait_setup();
	if (ret) {
		 /*
		  * Possible return codes related to uaddr:
		  * -EINVAL:    Not u32 aligned uaddr
		  * -EFAULT:    No mapping, no RW
		  * -ENOMEM:    Paging ran out of memory
		  * -EHWPOISON: Memory hardware error
		  *
		  * Others:
		  * -EWOULDBLOCK: value at uaddr has changed
		  */
		return ret;
	}

	futex_wait_queue_me();

	if (woken by futex_wake/requeue)
	   	return 0;

	if (timeout)
		return -ETIMEOUT;

	/*
	 * Spurious wakeup, i.e. no signal pending
	 */
	if (!signal_pending())
		goto retry;

	/* Handled in the low level syscall exit code */
	if (!timed_wait)
		return -ERESTARTSYS;
	else
		return -ERESTARTBLOCK;

Now in the low level syscall exit we try to deliver the signal

	if (!signal_delivered())
	      restart_syscall();

	if (sigaction->flags & SA_RESTART)
	      restart_syscall();

	ret_to_userspace -EINTR;

So we should never see -EINTR in the case of a spurious wakeup here.

But, here is the not so good news:

 I did some archaeology. The restart handling of futex_wait() got
 introduced in kernel 2.6.22, so anything older than that will have
 the spurious -EINTR issues.

futex_wait_pi() always had the restart handling and glibc folks back
then (2006) requested that it should never return -EINTR, so it
unconditionally restarts the syscall whether a signal had been
delivered or not.

So kernels >= 2.6.22 should never return -EINTR spuriously. If that
happens it's a bug and needs to be fixed.

> Third, I think it would be useful to -- somewhere -- explain which
> behavior the futex operations would have conceptually when expressed by
> C11 code.  We currently say that they wake up, sleep, etc, and which
> values they return.  But we never say how to properly synchronize with
> them on the userspace side.  The C11 memory model is probably the best
> model to use on the userspace side, so that's why I'm arguing for this.
> Basically, I think we need to (1) tell people that they should use
> memory_order_relaxed accesses to the futex variable (ie, the memory
> location associated with the whole futex construct on the kernel side --
> or do we have another name for this?), and (2) give some conceptual
> guarantees for the kernel-side synchronization so that one use this to
> derive how to use them correctly in userspace.
> 
> The man pages might not be the right place for this, and maybe we just
> need a revision of "Futexes are tricky".  If you have other suggestions
> for where to document this, or on the content, let me know.  (I'm also
> willing to spend time on this :) ).

The current futex code in the kernel has gained documentation about
the required memory ordering recently. That should be a good starting
point.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-24 10:05                           ` Thomas Gleixner
@ 2015-01-24 12:58                             ` Torvald Riegel
  2015-01-24 16:25                               ` Thomas Gleixner
  0 siblings, 1 reply; 80+ messages in thread
From: Torvald Riegel @ 2015-01-24 12:58 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Darren Hart, Michael Kerrisk (man-pages),
	Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Arnd Bergmann, Steven Rostedt, Peter Zijlstra, Linux API,
	Darren Hart, Anton Blanchard, Eric Dumazet, bill o gallmeister,
	Jan Kiszka, Daniel Wagner, Rich Felker

On Sat, 2015-01-24 at 11:05 +0100, Thomas Gleixner wrote:
> On Fri, 23 Jan 2015, Torvald Riegel wrote:
> 
> > On Fri, 2015-01-16 at 16:46 -0800, Darren Hart wrote:
> > > On 1/16/15, 12:54 PM, "Michael Kerrisk (man-pages)"
> > > <mtk.manpages@gmail.com> wrote:
> > > 
> > > >Color me stupid, but I can't see this in futex_requeue(). Where is that
> > > >check that is "independent of the requeue type (normal/pi)"?
> > > >
> > > >When I look through futex_requeue(), all the likely looking sources
> > > >of EINVAL are governed by a check on the 'requeue_pi' argument.
> > > 
> > > 
> > > Right, in the non-PI case, I believe there are valid use cases: move to
> > > the back of the FIFO, for example (OK, maybe the only example?).
> > 
> > But we never guarantee a futex is a FIFO, or do we?  If we don't, then
> > such a requeue could be implemented as a no-op by the kernel, which
> > would sort of invalidate the use case.
> > 
> > (And I guess we don't want to guarantee FIFO behavior for futexes.)
> 
> The (current) behaviour is:
> 
>     real-time threads:   FIFO per priority level
>     sched-other threads: FIFO independent of nice level
> 
> The wakeup is priority ordered. Highest priority level first.

OK.

But, just to be clear, do I correctly understand that you do not want to
guarantee FIFO behavior in the specified futex semantics?  I think there
are cases where being able to *rely* on FIFO (now and on all future
kernels) would be helpful for users (e.g., on POSIX/C++11 condvars and I
assume in other ordered-wakeup cases too) -- but at the same time, this
would constrain future futex implementations.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-24 11:35                 ` Thomas Gleixner
@ 2015-01-24 13:12                   ` Torvald Riegel
  2015-01-27  7:48                     ` Michael Kerrisk (man-pages)
  2015-02-05 19:57                   ` Darren Hart
  1 sibling, 1 reply; 80+ messages in thread
From: Torvald Riegel @ 2015-01-24 13:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Michael Kerrisk (man-pages),
	Carlos O'Donell, Darren Hart, Ingo Molnar, Jakub Jelinek,
	linux-man, lkml, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Darren Hart, Anton Blanchard, Eric Dumazet,
	bill o gallmeister, Jan Kiszka, Daniel Wagner, Rich Felker

On Sat, 2015-01-24 at 12:35 +0100, Thomas Gleixner wrote:
> So we should never see -EINTR in the case of a spurious wakeup here.
> 
> But, here is the not so good news:
> 
>  I did some archaeology. The restart handling of futex_wait() got
>  introduced in kernel 2.6.22, so anything older than that will have
>  the spurious -EINTR issues.
> 
> futex_wait_pi() always had the restart handling and glibc folks back
> then (2006) requested that it should never return -EINTR, so it
> unconditionally restarts the syscall whether a signal had been
> delivered or not.
> 
> So kernels >= 2.6.22 should never return -EINTR spuriously. If that
> happens it's a bug and needs to be fixed.

Thanks for looking into this.

Michael, can you include the above in the documentation please?  This is
useful for userspace code like glibc that expects a minimum kernel
version.  Thanks!


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-24 12:58                             ` Torvald Riegel
@ 2015-01-24 16:25                               ` Thomas Gleixner
  0 siblings, 0 replies; 80+ messages in thread
From: Thomas Gleixner @ 2015-01-24 16:25 UTC (permalink / raw)
  To: Torvald Riegel
  Cc: Darren Hart, Michael Kerrisk (man-pages),
	Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Arnd Bergmann, Steven Rostedt, Peter Zijlstra, Linux API,
	Darren Hart, Anton Blanchard, Eric Dumazet, bill o gallmeister,
	Jan Kiszka, Daniel Wagner, Rich Felker

On Sat, 24 Jan 2015, Torvald Riegel wrote:
> On Sat, 2015-01-24 at 11:05 +0100, Thomas Gleixner wrote:
> > On Fri, 23 Jan 2015, Torvald Riegel wrote:
> > 
> > > On Fri, 2015-01-16 at 16:46 -0800, Darren Hart wrote:
> > > > On 1/16/15, 12:54 PM, "Michael Kerrisk (man-pages)"
> > > > <mtk.manpages@gmail.com> wrote:
> > > > 
> > > > >Color me stupid, but I can't see this in futex_requeue(). Where is that
> > > > >check that is "independent of the requeue type (normal/pi)"?
> > > > >
> > > > >When I look through futex_requeue(), all the likely looking sources
> > > > >of EINVAL are governed by a check on the 'requeue_pi' argument.
> > > > 
> > > > 
> > > > Right, in the non-PI case, I believe there are valid use cases: move to
> > > > the back of the FIFO, for example (OK, maybe the only example?).
> > > 
> > > But we never guarantee a futex is a FIFO, or do we?  If we don't, then
> > > such a requeue could be implemented as a no-op by the kernel, which
> > > would sort of invalidate the use case.
> > > 
> > > (And I guess we don't want to guarantee FIFO behavior for futexes.)
> > 
> > The (current) behaviour is:
> > 
> >     real-time threads:   FIFO per priority level
> >     sched-other threads: FIFO independent of nice level
> > 
> > The wakeup is priority ordered. Highest priority level first.
> 
> OK.
> 
> But, just to be clear, do I correctly understand that you do not want to
> guarantee FIFO behavior in the specified futex semantics?  I think there
> are cases where being able to *rely* on FIFO (now and on all future
> kernels) would be helpful for users (e.g., on POSIX/C++11 condvars and I
> assume in other ordered-wakeup cases too) -- but at the same time, this
> would constrain future futex implementations.

It would be a constraint, but I don't think it would be a horrible
one. Though I have my doubts, that we can actually guarantee it under
all circumstances.

One thing comes to my mind right away: spurious wakeups. There is no
way that we can guarantee FIFO ordering in the context of those. And
we cannot prevent them either.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-24 13:12                   ` Torvald Riegel
@ 2015-01-27  7:48                     ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 80+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-27  7:48 UTC (permalink / raw)
  To: Torvald Riegel, Thomas Gleixner
  Cc: mtk.manpages, Carlos O'Donell, Darren Hart, Ingo Molnar,
	Jakub Jelinek, linux-man, lkml, Arnd Bergmann, Steven Rostedt,
	Peter Zijlstra, Linux API, Darren Hart, Anton Blanchard,
	Eric Dumazet, bill o gallmeister, Jan Kiszka, Daniel Wagner,
	Rich Felker

Hello Torvald,

On 01/24/2015 02:12 PM, Torvald Riegel wrote:
> On Sat, 2015-01-24 at 12:35 +0100, Thomas Gleixner wrote:
>> So we should never see -EINTR in the case of a spurious wakeup here.
>>
>> But, here is the not so good news:
>>
>>  I did some archaeology. The restart handling of futex_wait() got
>>  introduced in kernel 2.6.22, so anything older than that will have
>>  the spurious -EINTR issues.
>>
>> futex_wait_pi() always had the restart handling and glibc folks back
>> then (2006) requested that it should never return -EINTR, so it
>> unconditionally restarts the syscall whether a signal had been
>> delivered or not.
>>
>> So kernels >= 2.6.22 should never return -EINTR spuriously. If that
>> happens it's a bug and needs to be fixed.
> 
> Thanks for looking into this.
> 
> Michael, can you include the above in the documentation please?  This is
> useful for userspace code like glibc that expects a minimum kernel
> version.  Thanks!

I've added some text to my draft to cover this point.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2015-01-24 11:35                 ` Thomas Gleixner
  2015-01-24 13:12                   ` Torvald Riegel
@ 2015-02-05 19:57                   ` Darren Hart
  1 sibling, 0 replies; 80+ messages in thread
From: Darren Hart @ 2015-02-05 19:57 UTC (permalink / raw)
  To: Thomas Gleixner, Torvald Riegel
  Cc: Michael Kerrisk (man-pages),
	Carlos O'Donell, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Darren Hart, Anton Blanchard, Petr Baudis,
	Eric Dumazet, bill o gallmeister, Jan Kiszka, Daniel Wagner,
	Rich Felker

On 1/24/15, 3:35 AM, "Thomas Gleixner" <tglx@linutronix.de> wrote:

>On Fri, 23 Jan 2015, Torvald Riegel wrote:
>> Second, the current documentation for EINTR is that it can happen due to
>> receiving a signal *or* due to a spurious wake-up.  This is difficult to
>
>I don't think so. I went through all callchains again with a fine comb.
>
>futex_wait()
>retry:
>	ret = futex_wait_setup();
>	if (ret) {
>		 /*
>		  * Possible return codes related to uaddr:
>		  * -EINVAL:    Not u32 aligned uaddr
>		  * -EFAULT:    No mapping, no RW
>		  * -ENOMEM:    Paging ran out of memory
>		  * -EHWPOISON: Memory hardware error
>		  *
>		  * Others:
>		  * -EWOULDBLOCK: value at uaddr has changed
>		  */
>		return ret;
>	}
>
>	futex_wait_queue_me();
>
>	if (woken by futex_wake/requeue)
>	   	return 0;
>
>	if (timeout)
>		return -ETIMEOUT;
>
>	/*
>	 * Spurious wakeup, i.e. no signal pending
>	 */
>	if (!signal_pending())
>		goto retry;
>
>	/* Handled in the low level syscall exit code */
>	if (!timed_wait)
>		return -ERESTARTSYS;
>	else
>		return -ERESTARTBLOCK;
>
>Now in the low level syscall exit we try to deliver the signal
>
>	if (!signal_delivered())
>	      restart_syscall();
>
>	if (sigaction->flags & SA_RESTART)
>	      restart_syscall();
>
>	ret_to_userspace -EINTR;
>
>So we should never see -EINTR in the case of a spurious wakeup here.
>
>But, here is the not so good news:
>
> I did some archaeology. The restart handling of futex_wait() got
> introduced in kernel 2.6.22, so anything older than that will have
> the spurious -EINTR issues.
>
>futex_wait_pi() always had the restart handling and glibc folks back
>then (2006) requested that it should never return -EINTR, so it
>unconditionally restarts the syscall whether a signal had been
>delivered or not.
>
>So kernels >= 2.6.22 should never return -EINTR spuriously. If that
>happens it's a bug and needs to be fixed.
>
>> Third, I think it would be useful to -- somewhere -- explain which
>> behavior the futex operations would have conceptually when expressed by
>> C11 code.  We currently say that they wake up, sleep, etc, and which
>> values they return.  But we never say how to properly synchronize with
>> them on the userspace side.  The C11 memory model is probably the best
>> model to use on the userspace side, so that's why I'm arguing for this.
>> Basically, I think we need to (1) tell people that they should use
>> memory_order_relaxed accesses to the futex variable (ie, the memory
>> location associated with the whole futex construct on the kernel side --
>> or do we have another name for this?), and (2) give some conceptual
>> guarantees for the kernel-side synchronization so that one use this to
>> derive how to use them correctly in userspace.
>> 
>> The man pages might not be the right place for this, and maybe we just
>> need a revision of "Futexes are tricky".  If you have other suggestions
>> for where to document this, or on the content, let me know.  (I'm also
>> willing to spend time on this :) ).
>
>The current futex code in the kernel has gained documentation about
>the required memory ordering recently. That should be a good starting
>point.

Lots of paging in here... If I recall correctly there was something about
not being able to return to userspace in these events without owning the
lock (waiters but no owner, breaking pi chains and promotion, etc.), so
restarting was the preferable path.

-- 
Darren Hart
Intel Open Source Technology Center




^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: futex(2) man page update help request
  2014-05-15 19:38             ` Darren Hart
  2014-08-11 10:19               ` chrubis
  2014-11-26 13:41               ` Cyril Hrubis
@ 2015-02-16 13:14               ` Cyril Hrubis
  2 siblings, 0 replies; 80+ messages in thread
From: Cyril Hrubis @ 2015-02-16 13:14 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages),
	Thomas Gleixner, Ingo Molnar, Jakub Jelinek, linux-man, lkml,
	Davidlohr Bueso, Arnd Bergmann, Steven Rostedt, Peter Zijlstra,
	Linux API, Carlos O'Donell

Hi!
> I'll follow up with you in a couple weeks most likely. I have some urgent
> things that will be taking all my time and then some until then. Feel free
> to poke me though if I lose track of it :-)

FYI I've started to work on futex testcases for LTP. The first batch has
been commited in:

https://github.com/linux-test-project/ltp/commit/6270ba2ebe999ffdb1364e5e814d7e56070a0198

Some of these are losely based on futextest some are written from
scratch. The requeue operation, pi futexes and bitset are not covered
yet.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 80+ messages in thread

end of thread, other threads:[~2015-02-16 13:14 UTC | newest]

Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-14 10:35 futex(2) man page update help request Michael Kerrisk (man-pages)
2014-05-14 16:18 ` Darren Hart
2014-05-14 19:03   ` Michael Kerrisk (man-pages)
2014-05-14 19:59     ` Darren Hart
2014-05-14 20:23     ` Carlos O'Donell
2014-05-14 20:44       ` Andy Lutomirski
2014-05-14 23:34       ` Thomas Gleixner
2014-05-15  3:12         ` Carlos O'Donell
2014-05-15  4:49           ` Michael Kerrisk (man-pages)
2014-05-15  4:53         ` Michael Kerrisk (man-pages)
2014-05-15 14:14           ` Thomas Gleixner
2014-05-15 20:19             ` Michael Kerrisk (man-pages)
2014-08-04 14:46               ` Carlos O'Donell
2014-05-15 20:35             ` Darren Hart
2015-01-15 15:12               ` Michael Kerrisk (man-pages)
2015-01-17  1:33                 ` Darren Hart
2015-01-17  9:16                   ` Michael Kerrisk (man-pages)
2015-01-17 19:26                     ` Darren Hart
2015-01-18 10:18                       ` Michael Kerrisk (man-pages)
2015-01-15 15:10             ` Michael Kerrisk (man-pages)
2015-01-15 22:23               ` Thomas Gleixner
2015-01-16 15:17                 ` Michael Kerrisk (man-pages)
2015-01-16 15:20                   ` Thomas Gleixner
2015-01-16 20:54                     ` Michael Kerrisk (man-pages)
2015-01-17  0:46                       ` Darren Hart
2015-01-19 10:45                         ` Thomas Gleixner
2015-01-19 14:07                           ` Michael Kerrisk (man-pages)
2015-01-23 18:19                         ` Torvald Riegel
2015-01-24 10:05                           ` Thomas Gleixner
2015-01-24 12:58                             ` Torvald Riegel
2015-01-24 16:25                               ` Thomas Gleixner
2015-01-17  0:56                       ` Davidlohr Bueso
2015-01-17  1:11                         ` Darren Hart
2015-01-23 18:29               ` Torvald Riegel
2015-01-24 11:35                 ` Thomas Gleixner
2015-01-24 13:12                   ` Torvald Riegel
2015-01-27  7:48                     ` Michael Kerrisk (man-pages)
2015-02-05 19:57                   ` Darren Hart
2014-05-15  8:13       ` Peter Zijlstra
2014-05-15 15:43         ` Darren Hart
2014-05-15  8:14       ` Peter Zijlstra
2014-05-15 13:18         ` Carlos O'Donell
2014-05-15 13:22           ` Peter Zijlstra
2014-05-15 13:49             ` Michael Kerrisk (man-pages)
2014-05-15 13:55               ` Peter Zijlstra
2014-05-15 14:39               ` Carlos O'Donell
2014-05-15 15:11                 ` Peter Zijlstra
2014-05-14 20:56     ` Davidlohr Bueso
2014-05-14 21:03       ` Darren Hart
2014-05-14 22:21         ` Paul E. McKenney
2014-05-15  0:28       ` H. Peter Anvin
2014-05-15  0:35         ` Andy Lutomirski
2014-05-15  0:41           ` H. Peter Anvin
2014-05-15 19:10         ` Carlos O'Donell
2014-05-14 21:05   ` Davidlohr Bueso
2014-05-15 15:15     ` Joseph S. Myers
2014-05-15  0:18   ` H. Peter Anvin
2014-05-15  5:21     ` Darren Hart
2014-05-15  8:23       ` Peter Zijlstra
2014-05-15 13:46       ` Michael Kerrisk (man-pages)
2014-05-15 14:59         ` H. Peter Anvin
2014-05-15 15:42         ` chrubis
2014-05-15 15:52           ` H. Peter Anvin
2014-05-15 16:01             ` chrubis
2014-05-15 16:07               ` H. Peter Anvin
2014-05-15 16:17                 ` chrubis
2014-05-15 16:56                   ` H. Peter Anvin
2014-05-15 17:06                     ` chrubis
2014-05-15 15:47         ` Darren Hart
2014-05-15 15:35     ` chrubis
2014-05-15 15:28   ` chrubis
2014-05-15 15:40     ` Steven Rostedt
2014-05-15 16:14     ` Darren Hart
2014-05-15 16:30       ` chrubis
2014-05-15 18:17         ` Darren Hart
2014-05-15 19:05           ` chrubis
2014-05-15 19:38             ` Darren Hart
2014-08-11 10:19               ` chrubis
2014-11-26 13:41               ` Cyril Hrubis
2015-02-16 13:14               ` Cyril Hrubis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).