LKML Archive on lore.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Peter Oskolkov <posk@google.com>
Cc: Jann Horn <jannh@google.com>, Peter Oskolkov <posk@posk.io>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	Paul Turner <pjt@google.com>, Ben Segall <bsegall@google.com>,
	Andrei Vagin <avagin@google.com>,
	Thierry Delisle <tdelisle@uwaterloo.ca>
Subject: Re: [PATCH 2/4 v0.5] sched/umcg: RFC: add userspace atomic helpers
Date: Tue, 14 Sep 2021 10:07:00 +0200
Message-ID: <YUBYJLCYpy3yJO5F@hirez.programming.kicks-ass.net>
In-Reply-To: <CAPNVh5eaW7r_Nv-wHEyxQiFkXngmONwPyZSFvtTEhk3TxJ+iMA@mail.gmail.com>

On Thu, Sep 09, 2021 at 12:06:58PM -0700, Peter Oskolkov wrote:
> On Wed, Sep 8, 2021 at 4:39 PM Jann Horn <jannh@google.com> wrote:
> 
> Thanks a lot for the reviews, Jann!
> 
> I understand how to address most of your comments. However, one issue
> I'm not sure what to do about:
> 
> [...]
> 
> > If this function is not allowed to sleep, as the comment says...
> 
> [...]
> 
> > ... then I'm pretty sure you can't call fix_pagefault() here, which
> > acquires the mmap semaphore (which may involve sleeping) and then goes
> > through the pagefault handling path (which can also sleep for various
> > reasons, like allocating memory for pagetables, loading pages from
> > disk / NFS / FUSE, and so on).
> 
> <quote from peterz@ from
> https://lore.kernel.org/lkml/20210609125435.GA68187@worktop.programming.kicks-ass.net/>:
>   So a PF_UMCG_WORKER would be added to sched_submit_work()'s PF_*_WORKER
>   path to capture these tasks blocking. The umcg_sleeping() hook added
>   there would:
> 
>     put_user(BLOCKED, umcg_task->umcg_status);
>     ...
> </quote>
> 
> Which is basically what I am doing here: in sched_submit_work() I need
> to read/write to userspace; and we cannot sleep in
> sched_submit_work(), I believe.
> 
> If you are right that it is impossible to deal with pagefaults from
> within non-sleepable contexts, I see two options:
> 
> Option 1: as you suggest, pin pages holding struct umcg_task in sys_umcg_ctl;
> 
> or
> 
> Option 2: add more umcg-related kernel state to task_struct so that
> reading/writing to userspace is not necessary in sched_submit_work().

Durr.. so yeah this is a bit of a chicken and egg problem here. We need
a userspace page to notify we're blocked, but at the same time,
accessing said page can get us blocked.

And then worse, as Jann said, we cannot do this in the appropriate spot
because we could be blocking on mmap_sem, so we must not require
mmap_sem to make progress etc.. :/

Now, in reality actually taking a fault for these pages is extremely
unlikely, but if we do, there's really no option but to block and wait
for it without notification. Tough luck there.

So what we can do, is use get_user_page() on the appropriate pages
(alignment ensures the whole umcg struct is in a single page etc..)
the moment a umcg task enters the kernel. For this we need some
SYSCALL_WORK_ENTER flag.
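
To illustrate the single-page point, here is a small userspace sketch (not
kernel code). The 4096-byte page size, the 64-byte alignment and both helper
names are assumptions made for the example; none of them come from the patch
set. The idea: if the struct is no larger than its alignment and that
alignment divides the page size, an aligned address can never straddle a page
boundary, so pinning one page covers the whole struct.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the patch set. */
#define MODEL_PAGE_SIZE  4096u
#define MODEL_UMCG_ALIGN 64u	/* hypothetical alignment of struct umcg_task */

/* True if the byte range [addr, addr + size) lies within a single page. */
static bool fits_in_one_page(uintptr_t addr, size_t size)
{
	if (size == 0 || size > MODEL_PAGE_SIZE)
		return false;
	return (addr / MODEL_PAGE_SIZE) ==
	       ((addr + size - 1) / MODEL_PAGE_SIZE);
}

/* True if addr is a multiple of MODEL_UMCG_ALIGN (a power of two). */
static bool umcg_addr_aligned(uintptr_t addr)
{
	return (addr & (MODEL_UMCG_ALIGN - 1)) == 0;
}
```

A registration path could reject any address for which umcg_addr_aligned()
is false; the aligned ones then need only one page pinned each.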

So normally a task would have ->umcg_page and ->umcg_server_page be
NULL, the above SYSCALL_WORK_SYSCALL_UMCG flag would get_user_page() the
self and server pages. If get_user_page() blocks, these fields would
still be NULL and sched_submit_work() would not do anything, c'est la
vie.

Once we have the pages, any actual blocking hitting sched_submit_work()
can do the updates without further blocking. It can then also put_page()
and clear the ->umcg_{,server_}page pointers, because the task_work that
will set RUNNABLE *can* suffer mmap_sem (again, unlikely, again tough
luck if it does).

The reason for put'ing the pages on blocking, is that this guarantees
the pages are only pinned for a short amount of time, and 'never' by a
blocked task. IOW, it's a proper transient pin and doesn't require extra
special care or accounting.
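
The pin lifecycle described above can be modelled roughly as follows. This is
a userspace toy, not kernel code: every model_* name is hypothetical,
model_pin() only stands in for get_user_page(), and the would_block argument
fakes the case where the real call would have to sleep. What gets written to
the server page is left out, since that is not spelled out here.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_RUNNING, MODEL_BLOCKED };

/* Stand-in for a pinned user page: a pin count plus the status word. */
struct model_page {
	int pin_count;
	enum model_state status;
};

/* Stand-in for the two per-task pointers; NULL means "not pinned". */
struct model_task {
	struct model_page *umcg_page;
	struct model_page *umcg_server_page;
};

static void model_pin(struct model_page *page)   { page->pin_count++; }
static void model_unpin(struct model_page *page) { page->pin_count--; }

/*
 * Syscall-entry hook. If pinning would have to block, give up and leave
 * both pointers NULL; otherwise pin the self and server pages.
 */
static void model_syscall_enter(struct model_task *t,
				struct model_page *self,
				struct model_page *server,
				bool would_block)
{
	if (would_block)
		return;
	model_pin(self);
	model_pin(server);
	t->umcg_page = self;
	t->umcg_server_page = server;
}

/*
 * Blocking hook. Returns true if the BLOCKED notification was written.
 * With no pinned pages it does nothing: the task blocks unannounced.
 * Otherwise it updates the status, then drops both pins right away so
 * that a blocked task never holds one.
 */
static bool model_submit_work(struct model_task *t)
{
	if (!t->umcg_page || !t->umcg_server_page)
		return false;
	t->umcg_page->status = MODEL_BLOCKED;
	model_unpin(t->umcg_page);
	model_unpin(t->umcg_server_page);
	t->umcg_page = NULL;
	t->umcg_server_page = NULL;
	return true;
}
```

Note that the pin counts are back to zero as soon as model_submit_work()
returns, which is the 'transient pin' property argued for above.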



Also, can you *please* convert that RST crud to a text file, it's
absolutely unreadable gunk. Those documentation files should be readable
as plain text first and foremost. That whole rendering to html crap is
nonsense. Using a browser to read a text file is insane.
