Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Andrea Arcangeli <aarcange@redhat.com>
Cc: syzbot <syzbot+96cc7aba7e969b1d305c@syzkaller.appspotmail.com>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: INFO: task hung in pipe_read (2)
Date: Sat, 1 Aug 2020 10:39:00 -0700	[thread overview]
Message-ID: <CAHk-=wj15SDiHjP2wPiC=Ru-RrUjOuT4AoULj6N_9pVvSXaWiw@mail.gmail.com> (raw)
In-Reply-To: <fc097a54-0384-9d21-323f-c3ca52cdb956@I-love.SAKURA.ne.jp>

[-- Attachment #1: Type: text/plain, Size: 1210 bytes --]

On Sat, Aug 1, 2020 at 8:30 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Waiting for response at https://lkml.kernel.org/r/45a9b2c8-d0b7-8f00-5b30-0cfe3e028b28@I-love.SAKURA.ne.jp .

I think handle_userfault() should have a (shortish) timeout, and just
return VM_FAULT_RETRY.

The code is overly complex anyway, because it predates the "just return RETRY".

And because we can't wait forever when the source of the fault is a
kernel exception, I think we should add some extra logic to just say
"if this is a retry, we've already done this once, just return an
error".

This is a TEST PATCH ONLY. I think we'll actually have to do something
like this, but I think the final version might need to allow a couple
of retries, rather than just give up after just one second.

But for testing your case, this patch might be enough to at least show
that "yeah, this kind of approach works".

Andrea? Comments? As mentioned, this is probably much too aggressive,
but I do think we need to limit the time that the kernel will wait for
page faults.

Because userfaultfd has become a huge source of security holes as a
way to time kernel faults or delay them indefinitely.

                     Linus

[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 1829 bytes --]

 fs/userfaultfd.c | 35 ++++++++++++-----------------------
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 52de29000c7e..bd739488bb29 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -473,6 +473,16 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 		goto out;
 	}
 
+	/*
+	 * If this is a kernel fault, and we're retrying, consider
+	 * it fatal. Otherwise we have deadlocks and other nasty
+	 * stuff.
+	 */
+	if (vmf->flags & FAULT_FLAG_TRIED) {
+		if (WARN_ON_ONCE(!(vmf->flags & FAULT_FLAG_USER)))
+			goto out;
+	}
+
 	/*
 	 * Handle nowait, not much to do other than tell it to retry
 	 * and wait.
@@ -516,33 +526,12 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 						       vmf->flags, reason);
 	mmap_read_unlock(mm);
 
+	/* We'll wait for up to a second, and then return VM_FAULT_RETRY */
 	if (likely(must_wait && !READ_ONCE(ctx->released) &&
 		   !userfaultfd_signal_pending(vmf->flags))) {
 		wake_up_poll(&ctx->fd_wqh, EPOLLIN);
-		schedule();
+		schedule_timeout(HZ);
 		ret |= VM_FAULT_MAJOR;
-
-		/*
-		 * False wakeups can orginate even from rwsem before
-		 * up_read() however userfaults will wait either for a
-		 * targeted wakeup on the specific uwq waitqueue from
-		 * wake_userfault() or for signals or for uffd
-		 * release.
-		 */
-		while (!READ_ONCE(uwq.waken)) {
-			/*
-			 * This needs the full smp_store_mb()
-			 * guarantee as the state write must be
-			 * visible to other CPUs before reading
-			 * uwq.waken from other CPUs.
-			 */
-			set_current_state(blocking_state);
-			if (READ_ONCE(uwq.waken) ||
-			    READ_ONCE(ctx->released) ||
-			    userfaultfd_signal_pending(vmf->flags))
-				break;
-			schedule();
-		}
 	}
 
 	__set_current_state(TASK_RUNNING);

  reply	other threads:[~2020-08-01 17:39 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-01  8:45 syzbot
2020-08-01 15:29 ` Tetsuo Handa
2020-08-01 17:39   ` Linus Torvalds [this message]
2020-08-07  5:31     ` Andrea Arcangeli
2020-08-08  1:01       ` Tetsuo Handa
2020-08-10 19:29         ` Andrea Arcangeli
2020-08-13  7:00           ` Tetsuo Handa
2020-08-13 11:20             ` Tetsuo Handa
2020-08-22  4:34               ` [RFC PATCH] pipe: make pipe_release() deferrable Tetsuo Handa
2020-08-22 16:30                 ` Linus Torvalds
2020-08-23  0:21                   ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wj15SDiHjP2wPiC=Ru-RrUjOuT4AoULj6N_9pVvSXaWiw@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=aarcange@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=syzbot+96cc7aba7e969b1d305c@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    --subject='Re: INFO: task hung in pipe_read (2)' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).